THORNDIKE’S C.A.V.D. IS FULL OF G 
KARL J. HOLZINGER 
University of Chicago 


The chief point of this note is to show that Professor Thorndike’s 
well-known C.A.V.D. Intelligence Test may be thought of as saturated 
with Professor Spearman’s g rather than with a number of group 
factors. Professor Thorndike’s interpretation of three Spearman 
laws of cognition will also be touched upon. This second point is 
doubtless a very dangerous one because the writer is not a psychologist 
and has only a moderate g, while Professor Thorndike is not only a 
pure psychologist, but must have a g that approaches the colossal. 

The data for the first point are furnished in Professor Thorndike’s 
Measurement of Intelligence.¹ Here the C.A.V.D. test is compared 
with other intelligence examinations designated as follows: 


C.A.V.D. ................................................. X1 
Otis Self-administering Test ............................. X2 
Terman Group Test (Grades VII-XII) ....................... X3 
Stanford-Binet ........................................... X4 


The intercorrelations of these four tests are given by Professor 
Thorndike and may be written in the form 


         X1      X2      X3      X4 
X1       ..     .87     .94     .78 
X2               ..     .88     .77 
X3                       ..     .77 

The tetrad differences² work out 
t_{1234} = −.054 ± .017, t_{1243} = −.017 ± .010, t_{1342} = +.037 ± .022 
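
The arithmetic behind these three values can be checked with a short sketch. It assumes the usual definition of a tetrad difference, t_abcd = r_ab r_cd − r_ac r_bd, and uses the six intercorrelations printed above; the names `corr` and `tetrad` are illustrative only, and the probable errors are not recomputed.

```python
# Intercorrelations of the four tests as printed in the text (indices follow X1..X4).
r = {
    (1, 2): 0.87, (1, 3): 0.94, (1, 4): 0.78,
    (2, 3): 0.88, (2, 4): 0.77, (3, 4): 0.77,
}


def corr(a, b):
    """Look up r_ab regardless of the order of the two indices."""
    return r[(a, b)] if (a, b) in r else r[(b, a)]


def tetrad(a, b, c, d):
    """Tetrad difference t_abcd = r_ab * r_cd - r_ac * r_bd (assumed definition)."""
    return corr(a, b) * corr(c, d) - corr(a, c) * corr(b, d)


# The three distinct tetrad differences for four variables.
for idx in [(1, 2, 3, 4), (1, 2, 4, 3), (1, 3, 4, 2)]:
    label = "t_" + "".join(str(i) for i in idx)
    print(label, f"{tetrad(*idx):+.4f}")
```

Under these assumptions the three differences come to about −.0539, −.0165 and +.0374, which agree with the printed values to within rounding.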


¹ Thorndike, E. L.: “The Measurement of Intelligence.” Bureau of Publications, 
Teachers College, Columbia University, New York City. P. 96ff. 
² Professor Spearman’s “Abilities of Man.” The Macmillan Company. 

These tetrads may be regarded as statistically insignificant, showing 
the existence of a common factor g and four uncorrelated specific 
factors s_i, as indicated by the equations (1). 


x_1 = m_1 g + n_1 s_1 
x_2 = m_2 g + n_2 s_2 
x_3 = m_3 g + n_3 s_3          (1) 
x_4 = m_4 g + n_4 s_4 


The quantities m_i and n_i (i = 1, 2, 3, 4) are constants. 

The existence of g being indicated it is now possible to obtain its 
correlation with each of the four tests (see Spearman Appendix). 
These correlations are as follows: 


r_{1g} = .960, r_{2g} = .921, r_{3g} = .960, r_{4g} = .817 


If the probable error of these coefficients is of the order .01 or .02 it 
thus appears that C.A.V.D. and the Terman test are most highly 
saturated with g while Stanford-Binet is the least effective measure 
of this factor. 
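
The text defers the formula for these saturations to Spearman's appendix. As a hedged illustration, the sketch below assumes the pooled estimate r_ag² = (sum of the products of test a's correlations with each pair of other tests) / (sum of the correlations among those other tests); with the six intercorrelations printed earlier, this assumption reproduces the four values quoted above to three decimals. The function name `g_saturation` is hypothetical.

```python
from itertools import combinations
from math import sqrt

# Intercorrelations of the four tests as printed in the text (indices follow X1..X4).
r = {
    (1, 2): 0.87, (1, 3): 0.94, (1, 4): 0.78,
    (2, 3): 0.88, (2, 4): 0.77, (3, 4): 0.77,
}


def corr(a, b):
    """Look up r_ab regardless of the order of the two indices."""
    return r[(a, b)] if (a, b) in r else r[(b, a)]


def g_saturation(a, tests=(1, 2, 3, 4)):
    """Correlation of test a with g under a single-common-factor pattern.

    Assumed pooled formula: numerator sums r_ai * r_aj over pairs (i, j) of
    the other tests; denominator sums r_ij over the same pairs.
    """
    others = [t for t in tests if t != a]
    pairs = list(combinations(others, 2))
    numerator = sum(corr(a, i) * corr(a, j) for i, j in pairs)
    denominator = sum(corr(i, j) for i, j in pairs)
    return sqrt(numerator / denominator)


for a in (1, 2, 3, 4):
    print(f"r_{a}g = {g_saturation(a):.3f}")
```

Pooling over all pairs, rather than using a single triad of tests, is a design choice that uses every available correlation once the tetrads are accepted as vanishing.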

In his own analysis of the above correlations Professor Thorndike 
points out (p. 96, op. cit.) that “Intellect C.A.V.D. is very much the 
same as that which is measured by representative examinations for 
so-called intelligence.” Professor Spearman’s method of correlation 
with g not only shows that these four tests are measuring the same 
“intellect” but shows how well each measures this “intellect” and 
indicates that the “intellect” in common may be regarded as g as 
given in factor pattern (1). This seems to us a great advance over 
the usual crude methods of validation by correlations. The criterion 
by the Spearman method is a well defined g instead of a subjectively 
labeled test score. 

It is, of course, possible to interpret the same set of correlations in 
an infinite number of ways and to employ an infinite number of factor 
patterns other than (1). A certain western psychologist prefers to 
think of the variables as made up of an infinite number of independent 
elements even when the tetrads vanish. This is a possible interpreta- 
tion, but since it cannot be verified statistically and could be made 
regardless of the relation between correlation coefficients, it seems to 
us to have little scientific value. 

When the tetrad differences do not vanish, group factors are present 
and more elaborate factor patterns may be employed. Illustrations 
of this sort are given by Professor Kelley¹ and by Professor 
Spearman (loc. cit.). Some of the difficulties arising from the use of 
such elaborate patterns are the complexity or total lack of adequate 
probable errors to test the theory, and the great difficulty in attaching 
meaning to the numerous factors employed. 

¹ Kelley, T. L.: “Crossroads in the Mind of Man.” Stanford University 
Press. 

By way of illustrating these points, we may take an example from 
Professor Kelley’s book (p. 97ff.). The four tests used are: 

X1 = Reading speed. 
X2 = Arithmetic power. 
X3 = Memory for words. 
X4 = Memory for meaningful symbols. 


The intercorrelations and tetrad differences are as taken from a paper 
by the writer.¹ 


         X1       X2       X3       X4 
X1       ..     .0586    .1950    .2969 
X2                ..     .1487    .2489 
X3                         ..     .6693 


t_{1234} = −.010 ± .037 
t_{1243} = −.005 ± .037 
t_{1342} =  .005 ± .016 

From the insignificance of these tetrads we may conclude that 
factor pattern (1) with only g common is sufficiently complex. If 
group factors are present in these four tests their effect is insignificant, 
yet Professor Kelley employs the pattern given by the following 
portion of his Table XII (op. cit.). Numbers in the table are standard 
deviations. 


Columns, in order: α = heterogeneity, maturity, sex, race; β = verbal 
factor; γ = number factor; δ = memory; ε = spatial factor; ζ = speed 
factor; specific, not due to chance; specific, chance. 
(An entry of ... marks a cell that is blank or illegible in the source; 
one cell of the first row is missing.) 

Tests                               Entries, in column order 
1. Reading speed                    .40  .69  ...  .09  .38  .36  .28 
2. Arithmetic power                 .21  ...  ...  ...  ...  ...  .16  .66 
3. Memory for words                 .66  .09  ...  .56  ...  ...  .36  .33 
4. Memory for meaningful symbols    .59  ...  ...  .52  .36  ...  .32  .39 





























¹ Holzinger, K. J.: On Tetrad Differences of Overlapping Variables. Journal 
of Educational Psychology, Feb., 1929. 

The above example is but a fragment of Professor Kelley’s work 
on these data and is not included by way of criticism, but merely 
because the numerical work was at hand. As far as these four tests are 
concerned we hold that pattern (1) is adequate. Professor Kelley 
employs the elaborate pattern in the above table and interprets the 
common factor α as “heterogeneity, maturity, sex, and race.” We 
argue that the factors α, β, γ, δ, ε, ζ, etc. are insignificant in these four 
tests, and that whatever common factor is found may be regarded 
as g. 

At a meeting of distinguished psychologists last year several 
expressed grave doubt as to the meaning of g; they preferred α, β, γ, 
δ, ε and ζ probably because specific labels had been attached to them. 
The writer is certainly in doubt as to the correct interpretation of g, 
but this doubt rises to complete confusion when a half dozen other 
factors are added even though they are shown to be significant. We 
can at least get the correlation between g and other variables when the 
tetrads vanish and thus approach its meaning. The bases for determining 
the number, numerical value, and meaning of other general 
factors β, γ, δ, ε, ζ etc. are much more subjective. Such factors may 
need to be added in the analysis of variables, but only when pattern 
(1) doesn’t hold as shown by the tetrads. If they are added their 
meaning should be approached with even greater care than the 
meaning of g. 

The whole point of this digression is that when we do get the 
tetrads to vanish and thus establish the adequacy of pattern (1) we 
should be very happy about it. We have arrived at a simple and a 
very beautiful statistical explanation. We may also, by the Spear- 
man technique, build up a pool of tests concentrated so highly with 
g that we will come to know its meaning more clearly than that of 
tests subjectively labelled. 

It thus appears that Professor Thorndike should be pleased with 
our findings. If he ever makes another test more replete with g he 
should be still happier. He should not try to think of the common 
factor as “heterogeneity, maturity, sex and race,” but rather have 
these all eliminated and show that g is still there. 

Turning next to the interpretation of Spearman’s laws of cognition 
we may state these as follows: (1) Any lived experience tends to evoke 
immediately a knowing of its character and experiencer; (2) the mental 
presentation of any two characters (simple or complex) tends to evoke 
immediately a knowing of the relations between them. This law may 
be termed Eduction of Relations; (3) the presenting of any character 
together with any relation tends to evoke immediately a knowing 
of the correlative character. This law may be called Eduction of 
Correlates. The word “immediate” here indicates the absence of 
intervening processes. 

According to Professor Spearman, intelligence includes all processes 
derived from these three principles comprising the ability to appre- 
hend experience, educe relations and educe correlates. 

The second of the above laws may be illustrated by an analogies 
test. 

Bat: Ball: Hammer: Boy: Nail: Handle: Axe. Here the characters 
or fundaments are the seven items presented. The subject may 
educe various relations between these fundaments such as: 

Bat is used for hitting ball. 
Bat has same initial letter as ball. 
Hammer has handle. 
Hammer has same initial letter as handle. 
Hammer is used for hitting nail. 

In addition to these the subject may educe certain relations between 
the relations listed above. These might include: 

“is used for hitting” is unlike “has” 
“is used for hitting” is same as “is used for hitting” 
“has same initial letter as” is same as “has same initial letter as” 

The correct answer to the original test is “nail,” arrived at by 
educing some or all of the above relations. The question then remains 
what to do with the answer “handle” selected on the basis of the same 
initial letter as “hammer.” It may be argued that both answers are 
correct or that “nail” is a better answer than “handle” and should 
receive all or at least more credit than the latter. If the second argument 
is followed, then the subject might educe a relation of the sort: 

“Use of an object is generally more important than the initial 
letter of the object.” 

In any case the above example appears to involve only the eduction 
of simple relations like the first five and more complex relations 
like the last which are dependent upon the first. 


Professor Thorndike¹ has commented upon these three laws as 
follows: 


There is no doubt that the appreciation and management of relations is a very 
important feature of intellect, by any reasonable definition thereof. Yet it seems 
hazardous and undesirable to assume that the perception and use of relations is all 
of intellect. In practice, tests in paragraph reading, in information, and in range 
of vocabulary, seem to signify intellect almost as well as opposites and mixed 
relations tests. In theory, analysis (choosing suitable elements or aspects or 
relations), and organizing (managing many associative trends so that each is 
given due weight in view of the purpose of one’s thought), seem to be as deserving 
of consideration as the perception and use of relations. Moreover, I fear that in 
all four cases we need other valuations to decide which are better relations, or 
more abstract relations, or more essential elements, or the more sagacious relations, 
or the more consistent organization, or the more desirable balance of weights, and the 
like. (Italics are Thorndike’s.) 

¹ Loc. cit.: Pp. 19-20. 


To the unpsychological mind of the writer the “other valuations” 
cited above are largely, if not entirely, eductions of relations between 
relations. Thus the correct solution of the Bat: Ball problem probably 
rests chiefly upon eduction of the sort: 

Use of an object is more important than the initial letter of the object. 

This seems to us just as much an eduction as the case, 

Bat is used for hitting ball. 

In fact nearly all of the “evaluation” cited by Professor Thorndike 
appears to us as Spearman “eduction.” 

As a further illustration one may take Professor Thorndike’s Task 
1 on p. 163 (op. cit.). The subject is to make a true and sensible 
statement, filling one word in each of the blank spaces: “The . . . 
way to . . . is by airplane.” The filling of these spaces appears to 
involve eduction of relations between fundaments of a sort. Thus 
the second space might be filled with such words as “travel, die, swim,” 
etc. Likewise the whole new phrase “way to travel is by airplane,” 
or “way to swim is by airplane” may be treated as a new fundament 
and the first place filled by eduction resulting in words like “best, 
fastest, cheapest,” etc. Furthermore the completed series may 
be related to the fundament “previously experienced sensible fact,” 
and several such completed series related by the eduction “more 
sensible.” Thus the subject may arrive at the sentences: 

1. The quickest way to die is by airplane. 

2. The cheapest way to travel is by airplane. 

3. The quickest way to travel is by airplane. 

4. The slowest way to swim is by airplane. 

The choice of the “most sensible” completion might be obtained 
by a series of eductions between the above completions. According 
to Professor Thorndike this choice is a matter of “decisions” as to 
“most sagacious” relations, etc. How the subject could make the 
“most sagacious” selection without eduction we do not see, and if all 
such decisions, valuations and sagaciousness are ruled out of Spearman’s 
laws he has instead of Laws of Intelligence anything but that. 


TETRAD-DIFFERENCES FOR NON-VERBAL SUBTESTS* 


WILLIAM STEPHENSON 


University College, London 
INTRODUCTION 


The work to be described constitutes an introduction to intensive 
research on questions of verbality and thought; but the present series 
of papers concerns primarily the satisfaction of the theory of factors 
by data obtained by use of verbal and non-verbal subtests. The 
question of the factor content of certain non-verbal subtests is of 
first importance, and is our immediate concern; later, we shall examine 
comparable data for a set of verbal subtests, and ultimately we shall 
use the one kind of subtests as “reference values” for the other. 

By verbal subtests we refer to g-tests involving words, or phrases 
or other complex linguistic structures as fundaments; the non-verbal 
g-tests are set in spatial, perceptual, or pictorial or other ostensibly 
non-verbal fundaments. 

The recent researches of Line’ and Fortes* are indication of the 
present detailed attention given to non-verbal subtests. The interest 
in such subtests follows from the theory (and obtained facts) of the 
universality of the Spearman g-factor, that which appears to charac- 
terise eductive processes’ no matter under what conditions they 
function. The work of Davey? is recalled, where the g-factor was 
found to cover pictorial as well as verbal subtests. But, the correla- 
tional data gathered by Davey, myself, Line, and Fortes, are for 
populations of the order 100 only and, to have a more secure foundation 
of theory and supporting fact, it seems that some data should be got 
from large populations, of the order 1000 at least. The first object 
of the present paper is to provide data gathered from a population 
of 1037 girls. 
* We express our deep indebtedness to Professor Spearman, under whom we, 
as Research Assistant, covered the present work. 

A correlation table for one thousand population serves at least 
two points of value. We examine the table by means of the Spearman 
tetrad technique. In the first place, the Spearman Theory of Two 
Additive Factors requires a fit in the tetrads to within sampling error 
value, and, instead of sampling error of amount 0.03 for populations 
of one hundred, we should encounter sampling error of 0.0095 for one 
thousand population (with subtests of the kind used by us). Secondly, 
it is likely that any such Theory, dependent as it is upon the use of 
many subtests, involving scores of test items, and for data gathered 
under complex experimental conditions, will need to take account of 
errors other than sampling error. Particularly will this be so for 
large populations, with their resulting small sampling error. It is obvious 
that as the large errors, such as that of sampling, are diminished, more 
and more smaller disturbances will become noticeable; we expect this 
in our statistical and psychological material, just as we do in physical 
experimentation. Thus, the second object of our work is that of 
furthering the investigation of error in tetrads, other than sampling 
error. As in physics, these errors should receive consideration in 
detail, before correlational material is submitted to complicated factor 
patterns of the kind given by Professor Kelley.! 

If, from our correlation tables for one thousand population, the 
tetrad-differences show error in excess of sampling error, then we seek 
explanations of the excess, and finally may need to make it a subject 
of special investigation. If the excess error has a reasonable explana- 
tion, if it makes contact with expectations, if it can be controlled by 
other experimentation, then the Theory of Two Factors, built upon 
the tetrad criterion, is still fully acceptable. Indeed, wherever and 
to the extent that the excess error can be shown to be reasonably 
expected, the theory would not be fully acceptable unless such error 
did occur. Thus, the two objects set for the present paper—that 
concerning the g-factor for a battery of non-verbal subtests, for one 
thousand population, and that concerning error other than sampling 
error—really supplement each other. 

A few sources of excess error in tetrads have been suggested already 
in previous papers. Excessive likeness and accidental linkages between 
abilities, and heterogeneity of race, age, and sex, were early recognized 
by Spearman as explanations of excess error. Some excess error amongst verbal 
subtests (attributed to similarity of the fundaments or relations 
involved) was first observed by Davey,? and later by myself” (attrib- 
uted to scholastic influences, in part when the population is drawn from 
different schools or districts). Professor Spearman® has noted the 
influence of different scales of scoring subtests, and, unless difficulties 
are obviated, estimates an error of 0.01 to 0.02 as likely to accrue for 
large populations if the subtests are variously scaled. “Speed” 
preferences undoubtedly are a further potent source of excess error 
in tetrads. Idiosyncrasies introduced into subtests when these are 
constructed by one psychologist may lead to error in the tetrads for 
these subtests; and excess error may issue from a propinquity 
effect, deriving from the order of application of a battery of subtests.” 
Again, there is possible an influence entailed in testing by groups.’ 
Finally, not least of the influences that may lead to excess error in 
tetrads, there are mistakes in calculational work, test marking, and 
application. All the above are the kinds of possible disturbances in 
tetrads, influences leading to error in excess of sampling error. Our 
object is to gather what information we can about these, and any, 
influences. Some may be of psychological significance as objective 
specificalities, i.e., as a group factor or factors, others may be trivial 
effects that further experiments could obviate. 


EXPERIMENTAL MATERIAL 


The data to be examined are for a non-verbal group test of eight 
subtests, applied by myself to 1037 girls. Testing was by groups, of 
about fifty girls per group, in elementary schools.* Eleven schools, 
half the number in the region, were drawn from the centre of the city 
of Newcastle-on-Tyne (England). The schools, classes, numbers of 
girls per class, are shown in Table I. The average ages of the children 
in the classes are given at the foot of the Table. 

There were no extreme regional differences in social or economic 
status of the population of girls. The schools were much alike in 
point of teaching conditions and attainments. The small area covered, 
comprising the centre and greater part of the city, and the localizing 
effect of the city’s administration, combined to give the population a 
distinct homogeneity in respect to influences of the kind just mentioned. 

Age distribution is given in Table II. Eight hundred eighty-seven 
girls were drawn from Standards V and VIb (roughly equal classes), 
all of age falling within the range 10 years to 12½ years at the time 
of testing. These give to our work a large measure of homogeneity, 
so far as school can be compared with school, in both age and educational 
attainments or influences. The various classes from which 
these girls came contained forty girls below ten years of age, and these 
were retained. To offset these young bright girls, we took fifty bright 
girls from Standards IVa, of age less than ten years, and a further 
seventy-six (at random from the same IVa classes) of age 10 to 12½ 
years. Thus, one hundred twenty-six girls were from Standards 
IVa, but a large part of these were of prescribed age range (10 to 12½ 
years). Of the girls over 12½ years of age, seventy-five were in the 
various Standards V and VIb. The girls entered from Standard 
VIIb of one school could be omitted without altering any of the results 
reported in the course of our work. 

* The author is indebted to the Committee and to Mr. Walling, Director of 
Education. 


TABLE I 
(Only four of the school rows are partly legible in the source; ... marks 
an illegible entry.) 

School    Girls per standard (legible entries)    Total per school    Groups tested 
...       50, ...                                 97                  2 
...       34, 53, 1                               88                  2 
J         6, 61, ...                              94                  2 
K         26, ...                                 91                  2 

                        Standard IVa    Standard Va and b    Standard VIb    Standard VIIb 
Totals per standard     126             587                  300             24 
Total                   1037 
Average age per girl    10 yr. 2 mo.    10 yr. 10 mo.        12 yr. 1 mo.    12 yr. 5 mo. 


TABLE II.—AGE DISTRIBUTION FOR 1037 GIRLS 
(The age classes are illegible in the source; frequencies are listed in 
the printed order.) 

Age (Years, Months)     Frequency 
...                     5 
...                     21 
...                     64 
...                     141 
...                     156 
...                     206 
...                     185 
...                     169 
...                     60 
...                     24 
...                     6 

The non-verbal group test was applied in June 1929. Each testing 
group received the verbal group test on one day, followed by the 
non-verbal (our particular concern in the present paper) on the next 
day. Altogether, 1037 girls received both group tests. 

The Non-verbal Subtests—The group test had to be easily applicable 
in forty minutes full testing time. The subtests therefore had to be 
of simple construction, and we anticipated rather low intercorrelations 
for some of the subtests; on the other hand we expected that two of 
the subtests would be fairly highly g-saturated (No. III and No. V). 
The names of the subtests, in order of application to the girls, with the 
time allowance and number of test-units per subtest, were as follows: 


I. Alphabet Construction: 3 minutes, 10 test-units. 
II. Code: 2 minutes, 8 lines of code, scored per half-line. 
III. Fitting Shapes: 5 minutes, 15 test-units. 
IV. Picture Completion: 3 minutes, 15 test-units. 
V. Analogies, Form: 3½ minutes, 16 test-units. 
VI. Counting Cubes: 2½ minutes, 15 test-units. 
VII. XO Series Completion: 3 minutes, 15 test-units. 
VIII. Overlapping Shapes: 3½ minutes, 24 test-units. 


Throughout our work, we shall refer to these subtests by the Roman 
numerals I, II, etc. A brief description of each subtest follows: 


In subtest I the testee was given a capital letter, and in imagination a piece had 
to be cut away, leaving as remainder any other capital letter. If R was given, 
P could be the required response. I’s were debarred; the given letter could be 
unusually orientated (as A thought of as ∀); and one, two, or more pieces could 
be cut away (in imagination). The given letters were of simple printing, three-fourths 
inch high. A sample test-unit is shown in Fig. 1, in which three different 
responses are expected. The capitals used were, in order, X, U, A, H, E, S, Q, 
P, W, B, and the number of responses expected in each were 1, 1, 1, 2, 3, 2, 3, 5, 3, 
and 8, respectively. In marking we allowed one mark each for test-units 1, 2, 3, 
and 6, and two marks for the rest, Nos. 4, 5, 7, 8, 9, and 10. In the latter 
test-units one mark was allowed for half or more correct responses. In explaining the 
subtest, which appears to be readily understood, use was made of sample capitals 
Y, D, and M, providing V, L and C, and V and N, respectively. The samples 
were worked through on a black-board, following a standard procedure. 

Subtest II was the Code of the National Intelligence Group Test, a sample of a 
similar kind of subtest being worked through on the black-board. 

Subtest III can be understood by referring to Fig. 2. Three small shapes are 
given (together, to the left), which, when properly fitted together, form the larger 
shape (on the right). Lines had to be drawn in the larger shape (right) to indicate 
how it could be cut to give the three small shapes. A correct response required 
the drawing of no more than two or three lines in any test-unit. Four sample 
test-units were worked through on the black-board. 

Subtest IV was of the usual picture completion type; for example, a cup may 
be given which is minus its handle, and the testee is required to indicate the 
missing part. 

Subtest V, an Analogies subtest, made use of shapes and lines in various 
relations one to another. A sample test-unit is given in Fig. 3. The task of 
explaining the subtest’s requirements was rendered easy by referring the girls to a 
verbal analogies subtest given to them on the previous day. 

Subtest VI was a modified form of the American Army subtest of this name. 

Subtest VII, a series completion subtest, can be understood by referring to 
the sample below: 

X X O X X O X X O X X O — — — — 

The four dashes had to be filled in so as to continue the series, “X X O X” being 
the required responses in this sample. 

Subtest VIII required a little more preliminary explanation than was needed 
for most other subtests. Figure 4 shows the type of test-unit. All the shapes 
overlapping the 1 had to be indicated by cancelling the appropriate shapes adjacent 
to No. 1 on the right. A large square, a long rectangle, a long thin triangle, a 
circle, and a “five-sided” figure (regular pentagon), were the shapes made use of in 
each test-unit. In the case of the sample shown in Fig. 4, the square, triangle, 
and circle, required cancelling. 

[Figure 4. Sample test-unit for Subtest VIII.] 


The subtests have been described for adult understanding, but the 
testing directions were much more childlike and concrete. Applica- 
tion was standardised to an easy facility, and the subtests were given 
with this regularity throughout. Each subtest was prefaced by a 
fore-practice of six or so test-units, worked through on the class 
black-board; altogether fifteen minutes of the forty minutes testing 
time was so spent in giving directions and working the sample test- 
units. For each subtest the girls were allowed about fifteen seconds 
to settle to be ready for the command ‘‘Go,” each girl fingering the 
page ready for turning (cyclostyled sheets, stapled in the top left- 
hand corner, are quickly turned over, leaving the subtest exposed), 
and were brought to a stop with an energetic “Stop” command. I 
had group-tested many hundreds of boys and girls previously. 

Subtests I, II, IV, and VII occupied one sheet of paper each (quarto 
size), while the rest required two pages each; but in the latter cases 
the two pages of test-units faced each other, and appeared to the girls 
as one double-sized page in each case. There was, therefore, no page- 
turning during the time allotted for working these subtests. 

Enquiries elicited that no group tests had been applied to these 
girls previously. 


INTERCORRELATIONS AND TETRAD-DIFFERENCES 


Intercorrelations for the eight non-verbal subtests were first cal- 
culated for crude scores, using the formula for differences (Kelley 
p. 180). All calculations were checked in various ways, but the ques- 
tion of mistakes receives attention later. Table III gives the product- 
moment correlations for the eight non-verbal subtests, with age, for 
1037 girls. These we are to consider in terms of the Spearman Theory 
of Two Additive Factors. 

When we are concerned with tetrad-differences for a table of 
intercorrelations, we calculate one-half the number of tetrad-differences 
possible for the table (the other half are identical in value, but of 
opposite sign). The mean of this half-number of tetrad-differences, 
sign being disregarded, is thus the average or mean deviation about 
the mean of the observed differences. 

For the theoretical PE of the tetrad-differences we use formula 
16A (Spearman,’ p. xi). For exact comparison with this theoretical 
PE we calculate the observed probable error, given by 
0.6745 \sqrt{\sum t^2 / n}, where t stands for “tetrad-difference,” and n 
is the number of tetrad-differences. 

If “normal” probability distribution obtains for the observed 
tetrad-differences, the above mean deviation about the mean of the 
observed tetrad-differences, multiplied by 0.8453, gives the value of 
the probable error for these observed tetrad-differences. We anticipate 
that this latter value should be approximately the same as that 
given by the sigma above. 

For any set of tetrad-differences, then, we can give (and it is our 
usual practice to do so throughout our work) the following values: 


(a) The mean of the half-number of tetrad-differences, sign being disregarded. 
This is the mean deviation about the mean, or average, of the full set of tetrad- 
differences, when regard is paid to sign, the average of the full set being zero. 

(b) The probable error of the differences, calculated from the above mean, 
i.e., from the mean deviation, 

pe = 0.8453 × mean deviation 

(c) The probable error of the differences, calculated from the observed sigma 
of the differences, i.e., 

pe = 0.6745 \sqrt{\sum t^2 / n} 

(d) The theoretical probable error, always expressed as PE, given by equation 
16A (Spearman). 
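
Quantities (a) to (c) can be sketched in a few lines. The example below is illustrative only: it uses a made-up correlation matrix built from hypothetical g-loadings (not the values of Table III), takes the three distinct tetrad differences for every set of four subtests, and omits the theoretical PE of formula 16A. The function names are assumptions.

```python
from itertools import combinations
from math import sqrt


def tetrad_differences(R):
    """One of each +/- pair of tetrad differences for a symmetric matrix R."""
    diffs = []
    for a, b, c, d in combinations(range(len(R)), 4):
        diffs.append(R[a][b] * R[c][d] - R[a][c] * R[b][d])
        diffs.append(R[a][b] * R[c][d] - R[a][d] * R[b][c])
        diffs.append(R[a][c] * R[b][d] - R[a][d] * R[b][c])
    return diffs


def tetrad_summary(R):
    """Values (a), (b) and (c) of the text for the half-set of tetrad differences."""
    t = tetrad_differences(R)
    n = len(t)
    mean_dev = sum(abs(x) for x in t) / n      # (a) mean, sign disregarded
    sigma = sqrt(sum(x * x for x in t) / n)    # sigma about the zero mean of the full set
    return {
        "count": n,
        "mean_dev": mean_dev,
        "pe_from_mean": 0.8453 * mean_dev,     # (b)
        "pe_from_sigma": 0.6745 * sigma,       # (c)
    }


# Hypothetical g-loadings for eight subtests; r_ij = l_i * l_j gives a pure
# one-factor table in which every tetrad difference vanishes.
loadings = [0.80, 0.70, 0.75, 0.60, 0.85, 0.65, 0.70, 0.60]
R = [[1.0 if i == j else loadings[i] * loadings[j] for j in range(8)]
     for i in range(8)]

summary = tetrad_summary(R)
print(summary["count"], summary["mean_dev"])

# Disturbing a single correlation produces non-zero tetrad differences.
R_disturbed = [row[:] for row in R]
R_disturbed[0][1] = R_disturbed[1][0] = R[0][1] + 0.05
print(tetrad_summary(R_disturbed))
```

For eight subtests there are 70 sets of four, each giving three distinct tetrad differences, hence the 210 values referred to in the results.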


From Table III, before age is partialled-out, we obtain the following 
results for the tetrad-differences: 


(a) Mean of 210 tetrad-differences ........................ 0.0232 
(b) Observed pe, from mean × 0.8453 ....................... 0.0196 
(d) Theoretical PE ........................................ 0.0095 


If we partial out the age correlations, the new intercorrelations for 
the eight non-verbal subtests give the following results: 


(a) Mean of 210 tetrad-differences, age partialled-out correlations ... 0.02265 
(b) Observed probable error, from 0.8453 × the above mean ............ 0.0191 
(c) Observed probable error, from 0.6745 × sigma ..................... 0.0184 
(d) Theoretical PE ................................................... 0.0094 


Thus, as a whole, the intercorrelations show error in excess of that 
expected as sampling error, the excess being of the order 0.016.
Further, age has no significant influence on the tetrads here considered.
Our immediate problem, then, is the attempt at location of this residual 
excess error. 


LOCATION OF THE EXCESS ERROR


When the tetrad-differences show excess error it is our first object 
to try to explain the excess in terms of specificality, i.e., of a group
factor or factors, or in terms of extraneous influences, such as calcula- 
tion mistakes. Some possible specificalities and influences have been 
described in the Introduction. 

Without going into details, it may be said here that there is no 
evidence that any of the influences mentioned above (with exceptions 
to be considered below) have singly produced the residual error. 
Thus, the effect of testing by groups was found by me to be here with- 
out influence on the tetrads. Again, subtests III, V, VI, and VIII 
have somewhat similar shapes as fundaments, but there is no evidence 
of a broad gross specificality because of these fundaments. Subtests 
III and VIII involve very similar geometrical figures as fundaments, 
but specificality is not shown by $r_{III,VIII}$. We guarded against
“propinquity” influences in our testing procedure, although we would not
be quite free from an habituation disturbance for 7, or T11 
From the way in which the interesting subtests (judged from the girls’ 
attitudes towards the tests) were introduced in the battery, and from 
the short testing time entailed, we can neglect possibilities of fatigue, 
subjective or otherwise, entering our material. No matter what we 
attempt, we can not isolate any single correlation and submit that it is 
associated with the larger, or greater part, of the excess error shown 
by the tetrads. Thus, in a preliminary search for possible disturb- 
ances there remained two for consideration; first, the possible calcula- 
tion mistakes; second, the possible error introduced by dissimilar 
scales, by faults in score distributions. 

If, by recalculating intercorrelations for new scales (a new dis- 
tribution of scores for each subtest, without distorting the sense of the 
scores), we finally rid the tetrad-differences of excess error, the whole 
procedure will be open to the criticism that the intercorrelation tables 
vary slightly one from another, due to calculation mistakes, so that 
the final removal of the excess error would be considered to be merely a 
happy chance in the calculations. But, at least, we shall see the extent 
within which such variations due to calculation mistakes can be 
taken to be of influence in our work; and, should our procedure show
orderly diminution of the observed tetrad-differences, some evidence
will have been obtained for a fair measure of soundness in the calcula- 
tions, and for the supposed influence of the irregular distributions 
of scores. 

The Subtests Rescaled—We proceeded in two stages. First it
was thought perhaps sufficient to rescale the subtests so as to give
approximate “normal” distributions of scores for the 1037 girls.
When it was found that this approximation failed to give the expected
improvement in the error shown by tetrads, we subsequently converted
all crude scores into a “standard” normal distribution, the same for
each subtest. Using new correlations for seven of our subtests, each
subtest having an approximate “normal” score distribution, gave a
mean of 105 tetrad-differences of amount 0.0229; this is to be compared
with the value 0.0234 observed for the same seven subtests
with the intercorrelations of Table III. The average of the twenty-one
intercorrelations for the new table was 0.3598; the corresponding
average for Table III is 0.3614 (subtest VI omitted). Thus the
rescaling to an approximate “normal” distribution (amounting to
removal of any “skewness”), does not rid us of the excess error in the
tetrads. Nevertheless it perceptibly altered the intercorrelations,
in the direction of a smoothing-up of the table of intercorrelations,
giving better “hierarchy.” It appeared that just another such slight
smoothing would bring the tetrad-differences to within sampling
error values.


TABLE III.—PRODUCT-MOMENT CORRELATIONS FOR N OF 1037. CRUDE SCORES

     | Age  |  I   |  II  | III  |  IV  |  V   |  VI  | VII  | VIII
Age  | .... | 1783 | 1066 | 0550 | 0866 | 0039 | 0443 | 0152 | 0813
I    | .... | .... | 3565 | 4181 | 3227 | 3816 | 2254 | 2964 | 3376
II   | .... | .... | .... | 3275 | 3155 | 3606 | 1669 | 3083 | 3802
III  | .... | .... | .... | .... | 3998 | 4759 | 3663 | 3473 | 3678
IV   | .... | .... | .... | .... | .... | 3810 | 2671 | 2424 | 3152
V    | .... | .... | .... | .... | .... | .... | 2777 | 4045 | 3098
VI   | .... | .... | .... | .... | .... | .... | .... | [illegible] | [illegible]
VII  | .... | .... | .... | .... | .... | .... | .... | .... | [illegible]

It is not expected that crude scores should have much more exact
“normal” distributions of scores than those obtaining for our subtests.
But the number of girls given zero score in particular subtests
is frequently disproportionately large and, when the test-units of a
subtest are not sufficiently smoothly graded, the short time-allowance 
per subtest tends to accentuate the frequency of certain scores. Dis- 
turbances of this kind may hide the general results that are our 
concern. To take into consideration influences of this kind we con- 
sidered it worth while trying our data with new scales, this time fitting 
each subtest to a more exact ‘‘normal”’ distribution, the same for each 
subtest. 

For convenience the six subtests I, II, III, IV, V, and VIII were 
made the subject of the new conversion. The crude scores which 
supplied Table III were now converted to fit exactly the following 
approximate “‘normal”’ distribution: 





Score ........ | 0 | 1 |  2 |  3 |  4 |  5 |  6 |   7 |   8 |   9 |  10 | 11 | 12 | 13 | 14 | 15 | 16 | 17
Frequency .... | 3 | 7 | 16 | 24 | 48 | 70 | 98 | 118 | 135 | 134 | 118 | 98 | 70 | 48 | 24 | 16 |  7 |  3





The method used in converting the crude scores will be understood 
by referring to the following example of the procedure. 
The crude scores and frequencies for subtest I were as follows: 





Crude scores............. | 0 |  1 |  2 |  3 | ...
Frequencies.............. | 2 | 10 | 27 | 32 | ...








Of the ten girls obtaining crude score 1, one was chosen at random,
and her score was reduced to zero [making, with the two having zero
crude score, the three new-scale 0 scores required for the “normal”
distribution (for convenience we name this distribution the “standard”
one)]; seven of the girls retained their score 1 (now taken as new-scale),
and the other two received new-scale score 2. We now require fourteen
more new-scale 2’s: these were taken at random from amongst
the twenty-seven having crude score 2, leaving thirteen. The thirteen
are accredited with a score 3 on the new-scale. Similarly eleven of the
thirty-two crude score 3’s are given new-scale 3’s, and the rest are given
new-scale 4’s. This process is repeated up to the maximum score,
17 new-scale, all the selections within a particular crude-score being
made at random. It is obvious that the new scaling is reasonable
in that it does not distort the sense of the subtests; and it is valid,
having in mind the arbitrary system of allotting crude scores.
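
The conversion just described amounts to ranking the girls by crude score, breaking ties at random, and then dealing out new-scale scores in the frequencies of the “standard” distribution. A minimal sketch follows, assuming that reading of the procedure; the function name `rescale` and the crude scores used below are hypothetical and are not the original data.

```python
import random

# "Standard" distribution: frequencies of new-scale scores 0..17 (total 1037).
STANDARD_FREQ = [3, 7, 16, 24, 48, 70, 98, 118, 135,
                 134, 118, 98, 70, 48, 24, 16, 7, 3]


def rescale(crude_scores, target_freq, rng):
    """Return one new-scale score per girl, in the input order."""
    if len(crude_scores) != sum(target_freq):
        raise ValueError("target frequencies must account for every girl")
    # Rank by crude score; girls with equal crude scores are ordered at random.
    order = sorted(range(len(crude_scores)),
                   key=lambda i: (crude_scores[i], rng.random()))
    new_scores = [0] * len(crude_scores)
    pos = 0
    for new_score, freq in enumerate(target_freq):
        for i in order[pos:pos + freq]:
            new_scores[i] = new_score
        pos += freq
    return new_scores


# Hypothetical crude scores for 1037 girls, for illustration only.
rng = random.Random(0)
crude = [rng.randint(0, 15) for _ in range(1037)]
new = rescale(crude, STANDARD_FREQ, rng)
```

Because the ranking never reverses two girls with different crude scores, the sense of the subtest is kept; only the ties are split by chance.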
The six subtests so rescaled provide the intercorrelations given in
Table V, age being neglected. (It seems that age influence, for all
our tables of correlations, would introduce at most about 0.005 error 
in the tetrads. See in this connection Holzinger,‘ p. 28.) The sub- 
tests give tetrad values as follows: 


(a) Mean of 45 tetrad-differences for Table V.......... 0.0188
(b) Observed probable error, given by mean × 0.8453.... 0.0159
(c) Observed pe, given by sigma × 0.6745............... 0.0171     (β)
(d) Theoretical PE .................................... 0.0096


These values show improvement when compared with corresponding 
tetrads for Table III. But the improvement is greater than the 
above observed probable error evinces. One correlation, $r_{II,VIII}$,
has associated with it the larger of the tetrad-differences. If we omit 
the twelve tetrad-differences involving this correlation, we are left 
with the following: 


(a) Mean of 33 tetrad-differences for Table V, omitting
    $r_{II,VIII}$ ..................................... 0.0104
(b) Observed probable error, given by mean × 0.8453.... 0.0088
(c) Observed pe given by sigma × 0.6745................ 0.0080     (α)
(d) Theoretical PE .................................... 0.0095


The observed probable error (0.0088) is now attributable solely to 
sampling error. It seems that the improved distribution of subtest 
scores has removed error in the tetrads, other than sampling, and that 
due to $r_{II,VIII}$, as we see from a comparison of the values at (α) with
corresponding tetrads for Table III. Thus the forty-five comparable 
tetrads for Table III (crude scores) have the following values: 


(a) Mean, forty-five tetrad-differences for Table III...... 0.0244
(b) Observed pe (mean × 0.8453)............................ 0.0206
(d) Theoretical PE ........................................ 0.0100


Again, if we omit from these forty-five tetrads the twelve that involve 
the correlation $r_{II,VIII}$, we are left with the following values, for
comparison with the observed probable error of amount 0.0080 for Table V:


(a) Mean of thirty-three tetrad-differences, for Table III,
    subtests I, II, III, IV, V, and VIII, omitting $r_{II,VIII}$ 0.0191
(b) Observed probable error, given by mean × 0.8453........ 0.0159
(c) Observed probable error, given by sigma × 0.6745....... 0.0169
(d) Theoretical PE ........................................ 0.0098

Values quite similar to these for Table III were obtained also for
the table of intercorrelations alluded to at the beginning of this section,
where seven subtests were rescaled to an approximate “normal”
distribution of scores. The value 0.0169 at (c) above is for crude
score correlations, whilst 0.0080 at (α)(c) is for “standard” normal
probability distributions of subtest scores; we conclude that the
improved distribution of scores has removed excess error in the tetrads
(excepting $r_{II,VIII}$). The influence of the crude scores is likely to be
of amount 0.0149 (given by $\sqrt{0.0169^2 - 0.0080^2}$). This, we note, is of
the amount estimated by Professor Spearman as likely to accrue for large
populations if subtests are variously scaled as mentioned in the
Introduction to this article.
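
The subtraction used for the figure 0.0149 treats the two sources of error as independent, so that their squares add. A small sketch of that arithmetic follows; the function name `excess_error` is hypothetical.

```python
import math


def excess_error(observed_pe, residual_pe):
    """Error component left after removing residual_pe from observed_pe,
    assuming the two components are independent (squares add)."""
    return math.sqrt(observed_pe ** 2 - residual_pe ** 2)


# Crude-score pe (0.0169) against "standard"-scale pe (0.0080).
crude_influence = excess_error(0.0169, 0.0080)  # about 0.0149
```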





TABLE IV.—MEANS AND SIGMAS OF THE NON-VERBAL SUBTESTS, FOR N = 1037 GIRLS

Subtest | Mean | Sigma | Maximum score
I       | 6.89 | 2.32  | 15
II      | 7.01 | 2.22  | 15
III     | 4.48 | 1.87  | 15
IV      | 4.64 | 1.88  | 10
V       | 7.07 | 3.00  | 16
VI      | 3.81 | 2.47  | 13
VII     | 7.33 | 1.90  | 13
VIII    | 8.07 | 2.69  | 17











We conclude, then, that the observed probable error given at (α)
above for Table V, is as low as we can reasonably work for, and the
tetrads provide sampling error differences only; to complete this
conclusion, however, we must explain the excess error apparently
shown by the correlation $r_{II,VIII}$. Further, it appears that the
rescaling, to fit a “standard” “normal” distribution of scores, has effected
the improvement in the error shown by the tetrads. In this connection
we should note the following contributory evidence: The means
of the corresponding intercorrelations in Table III, in the table alluded
to for seven subtests with approximate “normal” distributions, and in
Table V, are 0.3614, 0.3598, and 0.3527 respectively; these values, we
suggest, offer an example of the fair quality of the calculations, and this
receives confirmation in other directions, for instance, from the
objective specificality (considered below) for $r_{II,VIII}$. We can not
readily consider that a happy chance has resulted in the improvement
shown in the observed error at (α). Calculation errors there are, no
doubt, but they are too fine to be significant in our tetrads.

To complete our account of the factor content of the non-verbal 
subtests we require to explain the excess error apparently shown 
by $r_{II,VIII}$.

Consideration of the $r_{II,VIII}$ Specificality.—One would not place too
much regard upon a single case of divergence from sampling error
values, and usually the details we are about to give would scarcely be
necessary. But the excess error attributed to $r_{II,VIII}$ can receive a
simple explanation, and the specificality can be shown to be objective:
it is a matter of importance to report the data concerned, for reasons 
connected with the acceptability of the Two Factor Theory for our 
data, for the contact it makes with the investigation of error other 
than sampling error, and for the evidence provided of the objectivity 
of the correlational values under consideration. 

TABLE V.—PRODUCT-MOMENT CORRELATIONS: NON-VERBAL SUBTESTS, WITH
SCORES CONVERTED TO A “STANDARD” NORMAL PROBABILITY DISTRIBUTION.
N = 1037 GIRLS

Subtest |  I   |  II  | III  |  IV  |  V   | VIII
I       | .... | 3463 | 4124 | 3274 | 3733 | 3207
II      | .... | .... | 3484 | 2974 | 3424 | 3797
III     | .... | .... | .... | 4140 | 4711 | 3508
IV      | .... | .... | .... | .... | 3771 | 3012
V       | .... | .... | .... | .... | .... | 3278
VIII    | .... | .... | .... | .... | .... | ....














We have suggested that $r_{II,VIII}$ is the source of excess error in Table
V. We do so by noting that the largest tetrad-differences for Table V
uniformly involve this correlation. Tetrads of the following type
are then isolated, and omitted from the main body of tetrads:

$(r_{ab} \cdot r_{xy}) - (r_{ax} \cdot r_{by}) = f$          (1)
$(r_{ab} \cdot r_{xy}) - (r_{ay} \cdot r_{bx}) = f'$

(where $r_{ab}$ is suspected for excess error, and where x and y are any
other subtests, taken two at a time, regard being taken of the sign of
the difference). The twelve tetrads of this form for $r_{ab} = r_{II,VIII}$,
for Table V, have mean of amount +0.0420.
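
Since Table V is given in full, the tetrads of form (1) can be recomputed from it. The sketch below is an illustration only: the names `TABLE_V`, `r`, `all_tetrads` and `form1_tetrads` are hypothetical, and the correlations are entered with the decimal point restored. Each set of four subtests yields three tetrad-differences, so six subtests give 15 × 3 = 45, and a single correlation enters two of the three in each of the six sets that contain it, giving twelve.

```python
from itertools import combinations

SUBTESTS = ["I", "II", "III", "IV", "V", "VIII"]

# Upper triangle of Table V, decimal point restored.
TABLE_V = {
    ("I", "II"): 0.3463, ("I", "III"): 0.4124, ("I", "IV"): 0.3274,
    ("I", "V"): 0.3733, ("I", "VIII"): 0.3207,
    ("II", "III"): 0.3484, ("II", "IV"): 0.2974, ("II", "V"): 0.3424,
    ("II", "VIII"): 0.3797,
    ("III", "IV"): 0.4140, ("III", "V"): 0.4711, ("III", "VIII"): 0.3508,
    ("IV", "V"): 0.3771, ("IV", "VIII"): 0.3012,
    ("V", "VIII"): 0.3278,
}


def r(a, b):
    """Correlation between subtests a and b, in either order."""
    return TABLE_V[(a, b)] if (a, b) in TABLE_V else TABLE_V[(b, a)]


def all_tetrads(subtests):
    """Three signed tetrad-differences for every set of four subtests."""
    out = []
    for a, b, c, d in combinations(subtests, 4):
        p1, p2, p3 = r(a, b) * r(c, d), r(a, c) * r(b, d), r(a, d) * r(b, c)
        out.extend([p1 - p2, p1 - p3, p2 - p3])
    return out


def form1_tetrads(a, b, subtests):
    """Tetrads f and f' of form (1), with r_ab as the suspected correlation."""
    others = [s for s in subtests if s not in (a, b)]
    out = []
    for x, y in combinations(others, 2):
        lead = r(a, b) * r(x, y)
        out.append(lead - r(a, x) * r(b, y))  # f
        out.append(lead - r(a, y) * r(b, x))  # f'
    return out


tetrads = all_tetrads(SUBTESTS)
suspect = form1_tetrads("II", "VIII", SUBTESTS)
suspect_mean = sum(suspect) / len(suspect)  # close to the +0.0420 reported
```

Worked by hand from the table, all twelve of these differences are positive, which is what one would expect if II and VIII share something beyond g.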
Now, there is evidence from other sources that strengthens the
claim of the $r_{II,VIII}$ excess error to a fair degree of objectivity. The
evidence is as follows:
In the table for crude scores, Table III, $r_{II,VIII}$ has the largest mean
associated with it, for tetrads of form (1), namely an amount +0.035.
Again, correlations worked for a sub-population of two hundred girls, 
a selected group with ages ranging 10-9 years to 11-3 years at the time 
of experiment, gave the following results: 


(a) Mean of two hundred ten tetrad-differences, sub-population
    of two hundred, eight non-verbal subtests............ 0.0319
(b) Observed probable error, given by mean × 0.8453...... 0.0270
(d) Theoretical PE ...................................... 0.0230


There are thirty tetrads of form (1) for $r_{II,VIII}$ in this two hundred
ten tetrads, with mean of value +0.0737; the omission of these thirty 
from the full table of tetrad-differences leaves the following results: 


(a) Mean of one hundred eighty tetrad-differences, sub-population
    of two hundred girls................................. 0.0248
(b) Observed probable error, given by mean × 0.8453...... 0.0210
(d) Theoretical PE ...................................... 0.0230


Other correlation tables, for sub-populations of one hundred girls, and 
five hundred girls, gave similar results. The most noticeably large 
tetrad-differences always involved the correlation $r_{II,VIII}$, and the
omission of this correlation left tetrads with sampling error values
only. The result shown for Table V, then, is not particular to that
table of correlations. The excess error associated with $r_{II,VIII}$ is
independent of the scoring methods used in our work; and the observation
of its influence is undoubtedly repeatable, i.e., objective.

An acceptable specificality between the two subtests II and VIII 
can be explained in terms of a ‘‘quantity-work,” or “‘speed,” prefer- 
ence. The Code subtest (II) may be liable to a pronounced prefer- 
ential attack for ‘‘speed” as against “‘quality.”” Subtest VIII would 
seem to entail a similar preference on the part of the testees. One girl 
may diligently ensure that each test-unit in subtest VIII is completely 
and surely answered, all the overlapping shapes being sought for. 
Another girl may be satisfied if one or two overlapping shapes are 
found, and would spend no time in a search for further shapes. The 
former may respond to five test-units only, marking probably fifteen 
figures that overlap, without a single error or omission; the latter may 
respond to all twenty-four test-units, marking forty overlapping figures, 
but with not a single test-unit correctly answered. A differential 
influence of this kind has been observed frequently in other work, in
researches made in our laboratory. With our present material, we
have a test of the “explanation”; and we next give this attention. 
Subtest VIII was scored for quality (Q, completeness of answer), 
and for quantity (q, total number of shapes correctly indicated, cor- 
rected for chance correct responses). The scores were added, giving 
equal weight to each, and (Q + q) was then the crude score for subtest 
VIII, used in all work described in the previous sections. The sep- 
arate Q and q scores, however, give us the following additional data: 


(1) The correlations $r_{II,q}$ and $r_{II,Q}$ are 0.4088 and 0.3217 respectively.

(2) $r_{Q,q}$ is 0.8505.

(3) Using the Q score only, the following new correlations were calculated, all
scores being set on the “standard” normal distribution excepting Q:

$r_{Q,I}$, $r_{Q,II}$, $r_{Q,III}$, $r_{Q,IV}$, $r_{Q,V}$, having values
0.3230, 0.3217, 0.3216, 0.2652, and 0.3232, respectively. Replacing the
correlations for subtest VIII in Table V by the corresponding ones given
above, gives a new set of tetrad-differences, with value as follows:

(a) Mean of 45 tetrad-differences, Table V, having Q in
    place of subtest VIII............................... 0.0152
(b) Observed probable error, given by mean × 0.8453..... 0.0128
(c) Observed probable error, given by sigma × 0.6745.... 0.0128     (γ)
(d) Theoretical PE ..................................... 0.0095

The twelve tetrads of form (1) for the correlation $r_{II,Q}$ have mean of value
0.0251. The above values should be compared with the values given at (β) on
page 178; and the mean 0.0251 likewise should be compared with the mean of
0.0420 given on page 180, for the tetrads of form (1) involving $r_{II,Q}$ and
$r_{II,VIII}$ respectively. The use of Q gives lower error in the tetrads.

(4) With the correlations already provided, it is possible to calculate the
following correlations:

$r_{I,q}$, $r_{II,q}$, $r_{III,q}$, $r_{IV,q}$, $r_{V,q}$

These have values 0.2940, 0.4088, 0.3519, 0.3142, 0.3074, respectively. As at
(3) given, we replace the correlations for the subtest VIII in Table V by these q
correlations. The resulting table of correlations provides forty-five
tetrad-differences, with value much the same as those given above at (3) for the Q
correlations, except that the twelve tetrads of form (1) have mean +0.0545. This
value should be compared with the 0.0251 for the Q-measure tetrads of form (1),
and with the (Q + q) measure value of 0.0420.
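
The calculation behind item (4) is not spelled out in the text. One way it may have been done, offered here only as a sketch, uses the correlation of a subtest with an equally weighted sum: if Q and q carry equal weight (equal sigmas), then $r_{x,Q+q} = (r_{x,Q} + r_{x,q}) / \sqrt{2(1 + r_{Q,q})}$, which can be solved for $r_{x,q}$. The assumption of equal sigmas, the use of the Table V correlations for (Q + q), and the names below are all ours. On those assumptions the formula returns the reported values for subtests I, II, IV and V to four decimals, and a value about 0.0014 above the reported one for subtest III.

```python
import math

R_Q_q = 0.8505  # correlation between the Q and q scores, item (2)

# Correlations of each subtest with VIII, i.e. (Q + q), from Table V.
R_SUM = {"I": 0.3207, "II": 0.3797, "III": 0.3508, "IV": 0.3012, "V": 0.3278}
# Correlations of each subtest with Q alone, item (3).
R_Q = {"I": 0.3230, "II": 0.3217, "III": 0.3216, "IV": 0.2652, "V": 0.3232}
# Values reported in item (4), for comparison.
REPORTED_q = {"I": 0.2940, "II": 0.4088, "III": 0.3519, "IV": 0.3142,
              "V": 0.3074}


def r_with_q(r_sum, r_q_upper, r_qq):
    """Solve r_sum = (r_Q + r_q) / sqrt(2 (1 + r_Qq)) for r_q,
    assuming Q and q enter the sum with equal sigmas."""
    return r_sum * math.sqrt(2.0 * (1.0 + r_qq)) - r_q_upper


derived_q = {s: r_with_q(R_SUM[s], R_Q[s], R_Q_q) for s in R_SUM}
```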


Thus, it seems, from the considerations we have given of the matter
in the above paragraphs, that the specificality shown by $r_{II,VIII}$ has a
fair degree of satisfactoriness, owing most to a common influence
between subtest II and the q (quantity) measure of the subtest VIII.
The excess error for the tetrad-differences of Table V, which we
associated with $r_{II,VIII}$, is shown above to be repeatable and therefore
objective (within the limits of our data for the 1037 girls); similar
error has been found in other work, in which a similar explanation in
terms of a “speed” effect has been offered; the explanation seems
reasonable for the two subtests concerned; and it receives experimental
support from the Q and q correlations. Thus, the $r_{II,VIII}$ specificality
is well authenticated.

Further, this ‘‘speed”’ specificality seems to be the only one that 
can be isolated for Table V. Relevant to this conclusion we should 
add the following details. 

One may envisage a complicated cancelling, in the tetrads for 
Table V, whereby the correlations ryy4, Tyr) Tuy, 220 Trym, cancel 
out (except for the latter) any influence that each may have separ- 
ately towards excess error. If we omit all tetrads involving these 
four correlations, we are left with only eleven of the forty-five tetrads 
for Table V, with mean of amount 0.0083, observed probable error 
of 0.0069, theoretical PE of 0.0094. But we are not justified in 
suggesting the isolation of correlations that already give only samp- 
ling error value to tetrads. However, so far we have been critically 
concerned with only six of our eight non-verbal subtests; if need be, 
we could widen our data by including the two subtests so far not 
rescaled, thereby perhaps obtaining further information about the 
effect of any, or all, of the four correlations considered above. There 
is reason to suppose that if subtests VI and VII were rescaled, they, 
too, would fall into line with the values and characteristics shown by 
Table V; thus, subtest VI was included in the correlation table alluded 
to in this article under ‘‘The Subtests Rescaled”’ (for the seven sub- 
tests with approximate “normal” distribution of scores), and it fits 
the tetrads as well as any other subtests; and, further, subtests VI and 
VII had clearly the most irregular crude score distributions, which 
amply explain the correlation anomalies they show in Table III. 
Until we find clear need to seek further information about the above- 
mentioned correlations, or any others, we propose being satisfied with 
the data obtained already for the six subtests in Table V. 

The correlation 7,,, requires a moment’s consideration because 
of contact made with it in the course of work to be described in a 
subsequent paper. We may state, without going into details (a 
matter that can be examined by anyone so interested, since all the 
necessary data are provided in this paper), that ry,, is in no way 
singular in its effects in tetrads. No matter from what angle we examine
our data, we obtain no evidence in support of specificality for
Tyry- Parenthetically we should add that in any detailed examina- 
tion it should be remembered that the Q measure for subtest VIII is 
biased for quality (with the narrow conception given to this word, 
namely, ‘“‘completeness of answer’’), and this slightly disturbs corre- 
lations with Q. Any need to consider ry, y (Or Tym, Ty ete.) as a 
possible source of excess error, is removed if the Q or q scores are 
neglected. 

We conclude that $r_{II,VIII}$, in Table V, alone involves acceptable
specificality. 


RÉSUMÉ AND CONCLUSION


Commencing with the intercorrelations of Table III, which provide 
an observed probable error of 0.0196 for their tetrad-differences, we 
have endeavored to locate error of amount about 0.016 in excess of 
that expected for sampling error (0.0095). If we are able to explain
the excess error satisfactorily, then we are left to conclude that the 
intercorrelations for the non-verbal subtests have good agreement 
with the Theory of Two Additive Factors. 

Of first importance is the question of our calculation mistakes. 
The sequence of error for tetrads for Table III, that alluded to for the 
seven subtests with only approximate “normal” score distributions, 
and for Table V, as well as the play of error for the various influences 
considered in the previous section are the evidence that we offer of the 
quality of our correlational calculations. We cannot hope to be free 
entirely of calculational error; but, we add, the correlation calculations 
should be accepted even if our tetrad calculations are found slightly 
inaccurate, because the correlations have received the greater atten- 
tion. Further, our data are lodged permanently in the Psychological 
Laboratory, University College, London, and are available for anyone 
to rework. 

Neglecting possible calculation mistakes, and age influence (which 
is no doubt present, but introduces very little error, of amount 0.005 
at most), we account for the excess error of Table III tetrads as 
follows: 

First, there is error introduced when the subtest scores are not 
distributed satisfactorily “normally.” When the crude scores are
converted to fit a “standard” normal probability distribution, the
excess error is removed, except for a single specificality that can be
isolated and explained. Neglecting this single specificality gives an
observed probable error of 0.0081, for a theoretical PE of 0.0095. 

We show that this single specificality, for $r_{II,VIII}$, is objective, and
acceptable as the only specificality among our non-verbal subtests.

The Theory of Two Additive Factors therefore fits well with the
observed results for our non-verbal subtests, allowing for $r_{II,VIII}$. We
explain the intercorrelations in terms of a common g-factor, and factors
specific to each subtest (with specificality for $r_{II,VIII}$). The g-factor
we take to be that first observed by Professor Spearman, the factor 
that appears to characterize relation and correlate eduction. Thus, 
the g-factor hitherto observed for non-verbal subtests for small 
populations, is here shown to obtain for one thousand population. 

Finally, concerning our second object, that of knowledge of error 
other than sampling error, we can here say that the facts can best 
be given on completion of our work, with the verbal subtests considered 
in addition to the non-verbal. At this juncture we note that, accept- 
ing our correlations, it is possible to observe the influence of non- 
normal distribution of scores for the subtests. Our second point of 
note is that we isolate a single specificality, attributed to a “‘speed pref- 
erence.” Such a “‘speed” effect must be looked for in all work with 
g-tests; and the influence must be kept in mind particularly in our work 
with the verbal subtests.


A COMPARISON OF WHITE AND NEGRO CHILDREN:
NORMS ON MIRROR-DRAWING FOR NEGRO 
CHILDREN BY AGE AND SEX 


R. J. CLINTON
Oregon State College 


A great deal has been written and said by way of comparing the 
white and Negro races, both in mental ability and motor ability. 
Ragsdale¹ said that there was not much difference in the ability of
white and Negro boys, in the rate of tapping, “although there seems
to be a slight tendency for white boys to be superior at the later ages.”
He found that there was not much difference in the rate of tapping
of white and Negro girls until the ninth year when “white girls are
consistently superior in tapping rate.” Waltner² states: “This study
of the learning capacity of Negroes as compared with whites finds the
power to form sensory-motor coordinations by Negroes to be about
72.8 per cent that of whites.”

Bardin³ observed that the anatomy and physiology of the Negro’s
brain hindered him in learning. Jordon’s⁴ personal observation was
that the learning capacity of the Negro could be determined by the
darkness of his skin. Phillips⁵ suggested retardation as a criterion
for judging the ability of Negro children.

Schwegler and Winn⁶ studied a group of 58 Negro boys and girls
and a like number of white boys and girls in the Junior High School 
of Lawrence, Kansas. The selection was by chance. They state: 


There seems to be an unmistakable difference in the intellectual life of the two 
groups studied. The median intellectual endowment of the colored group is 
about eighty-five per cent of that of the white group.





1 Ragsdale, C. E.: “Psychology of the Negro.” Thesis, University of Missouri,
p. 144.

2 Waltner, Erma: “Psychology of Negro.” Thesis, University of Missouri,
p. 36.

3 Bardin, J.: Factors in the Southern Race Problem. P. Sc. M., 1913, Vol.
LXXXIII, pp. 368-374.

4 Jordon, H. E.: The Biological and Sociological Worth of the Mulatto.
P. Sc. M., Vol. LXXXIII, pp. 573-582.

5 Phillips, B. A.: Retention in Elementary School of Philosophy. Psychological
Clinic, Vol. VI, p. 79.

6 Schwegler, R. A., and Winn, Edith: Comparison Study of White and Colored
Children. Journal Educational Research, Vol. II, pp. 838-847.

The writer became interested in the studies cited and desired
to make a comparison of the abilities of white and Negro children¹
in mirror-drawing, motor speed as shown by marking, and letter-forming
speed as shown by writing, as well as to make some comparisons
of the mental abilities of the two races.

By means of the use of the Otis Self-administering Test of Mental 
Ability, the writer made a comparison of the mental abilities of an 
unselected group of white high school pupils, and an unselected group 
of Negro high school pupils. The one hundred fifty-five white 
high school pupils gave a mean IQ of 100.5, and the one hundred 
twenty-two Negro high school pupils gave a mean IQ of 84.5. 


METHOD OF THE STUDY 


The mirror-drawing equipment² consisted of a stationary mirror
on a drawing board, and a small shield to prevent the subject from 
looking directly at the hand while working. The mirror-drawing 
pattern consisted of a two-circle pattern with numbers from 1 to 24. 
Numbers 2, 5, 8, 11, 14, 17, 20, 23 make up the small inner circle, 
and numbers 1, 3, 4, 6, 7, 9, 10, 12, 13, 15, 16, 18, 19, 21, 22, and 24 


TABLE I.—COMPARISON OF WHITE BOYS AND NEGRO BOYS FROM SIX TO SEVENTEEN
YEARS OF AGE IN MIRROR-DRAWING

          White boys              |          Negro boys
Age | Number | Mirror-drawing     | Age | Number | Mirror-drawing
  6 |   38   | [illegible]        |   6 |   19   |  1.7
  7 |   35   |  6.6               |   7 |   21   |  2.9
  8 |   49   | [illegible]        |   8 |   25   |  8.2
  9 |   27   | [illegible]        |   9 |   28   |  6.8
 10 |   43   | 10.5               |  10 |   22   |  7.2
 11 |   30   | [illegible]        |  11 |   21   |  9.7
 12 |   40   | [illegible]        |  12 |   28   |  8.3
 13 |   32   | 15.8               |  13 |   28   |  8.2
 14 |   44   | 20.1               |  14 |   10   | 16.2
 15 |   34   | 22.4               |  15 |   23   | 10.8
 16 |   46   | 24.9               |  16 |   16   |  9.3
 17 |   49   | 30.8               |  17 |   10   | 16.5




















1 The writer included 591 Negro children in this investigation.
2 Clinton, R. J.: Nature of Mirror-drawing Ability: Norms on Mirror-drawing
for White Children by Age and Sex. Journal Educational Psychology, Mar., 1930.

make up the large outer circle. The subjects were instructed to draw
a continuous line from 1 to 24, cutting the small dot at each number.


TABLE II.—COMPARISON OF WHITE GIRLS AND NEGRO GIRLS FROM SIX TO
SEVENTEEN IN MIRROR-DRAWING

          White girls             |          Negro girls
Age | Number | Mirror-drawing     | Age | Number | Mirror-drawing
  6 |   28   |  2.9               |   6 |   17   | [illegible]
  7 |   41   |  4.9               |   7 |   18   |  4.9
  8 |   53   |  5.9               |   8 |   24   |  3.7
  9 |   28   |  8.3               |   9 |   32   |  4.6
 10 |   44   |  8.9               |  10 |   37   |  6.9
 11 |   49   | 10.9               |  11 |   29   |  5.3
 12 |   49   | 13.8               |  12 |   28   |  7.2
 13 |   37   | 16.1               |  13 |   45   |  8.1
 14 |   57   | 22.6               |  14 |   18   |  7.7
 15 |   61   | 30.7               |  15 |   30   |  9.2
 16 |   48   | 34.4               |  16 |   24   | 23.1
 17 |   49   | 38.6               |  17 |   15   | 26.1




















TABLE III.—COMPARISON OF WHITE BOYS AND NEGRO BOYS FROM SIX TO
SEVENTEEN IN MARKING SPEED AND LETTER MAKING SPEED

            White boys            |            Negro boys
Age | Number | Marks | Letters    | Age | Number | Marks | Letters
  6 |   38   |  119  |   54       |   6 |   19   |   98  |   45
  7 |   35   |  124  |   74       |   7 |   21   |  113  |   71
  8 |   49   |  153  |  123       |   8 |   25   |  153  |   84
  9 |   27   |  160  |  150       |   9 |   28   |  169  |  118
 10 |   43   |  195  |  179       |  10 |   22   |  167  |  141
 11 |   30   |  203  |  191       |  11 |   21   |  192  |  146
 12 |   40   |  223  |  200       |  12 |   28   |  216  |  160
 13 |   32   |  243  |  228       |  13 |   28   |  222  |  183
 14 |   44   |  252  |  234       |  14 |   10   |  229  |  197
 15 |   34   |  264  |  241       |  15 |   23   |  254  |  209
 16 |   46   |  271  |  250       |  16 |   16   |  263  |  234
 17 |   49   |  285  |  268       |  17 |   10   |  248  |  236

The score was the number of lines completed in a five minute period.
The writer designed a mirror-drawing pattern to use in Grades I, II,
and III, which included the figures from 1 to 15. The figures were
so placed that the small children had no difficulty in locating them.


TABLE IV.—COMPARISON OF WHITE GIRLS AND NEGRO GIRLS FROM SIX TO
SEVENTEEN IN MARKING SPEED AND LETTER MAKING

            White girls                  Negro girls
Age   Number   Marks   Letters     Number   Marks   Letters
  6     28      112       52         17      102       66
  7     41      147       96         18      136      105
  8     53      164      137         24      161      122
  9     28      193      176         32      175      138
 10     44      212      194         37      202      163
 11     49      217      207         29      208      156
 12     49      228      227         28      231      199
 13     37      249      250         45      223      206
 14     57      263      268         18      236      230
 15     61      270      272         30      238      231
 16     48      271      274         24      264      250
 17     49      274      280         15      265      251





























TABLE V.—NORMS FOR NEGRO BOYS AND NEGRO GIRLS IN MIRROR-DRAWING,
MARKING, AND LETTER MAKING

               Negro boys                          Negro girls
Age   Mirror-drawing   Marks   Letters    Mirror-drawing   Marks   Letters
  6        1.7           98       45       [illegible]      102       66
  7        2.9          113       71           4.9          136      105
  8        8.2          153       84           3.7          161      122
  9        6.8          169      118           4.6          175      138
 10        7.2          167      141           6.9          202      163
 11        9.7          192      146           5.3          208      156
 12        8.3          216      160           7.2          231      199
 13        8.2          222      183           8.1          223      206
 14       16.2          229      197           7.7          236      230
 15       10.8          254      209           9.2          238      231
 16        9.3          263      234          23.1          264      250
 17       16.5          248      263          26.1          265      251























The two mirror-drawing patterns were evaluated on comparable groups
in order to get continuous norms through all ages from six to seventeen.
Tables I and III show a comparison of the ability of white and
Negro boys from six to seventeen years of age in mirror-drawing, and
in marking speed and letter making. Tables II and IV show a comparison
of the same abilities between white and Negro girls from six to
seventeen years of age. Table V shows the norms in mirror-drawing,
marking speed, and speed in making letters for Negro boys and girls
from six to seventeen.


CONCLUSIONS FROM THE STUDY 


1. Unselected white high school pupils are superior mentally to 
unselected Negro high school pupils, as shown by the tests. 

2. In the simple motor process of marking, there is not much 
difference between the white and Negro children. 

3. In writing, which requires a greater degree of motor-sensory 
coordination, the superiority of the white children is clearly shown. 

4. In the complex motor-sensory coordination process of mirror- 
drawing, the white children are consistently superior to the Negro 
children. 

5. The ability to do mirror-drawing increases rather consistently
from year to year with Negro children, as well as with white children.








A STUDY IN REVERSING THE HANDEDNESS OF SOME
LEFT-HANDED WRITERS¹


NORMA V. SCHEIDEMANN 
University of Southern California 


AND 


HAZEL COLYER 
Los Angeles City Schools 


The subject of handedness has long been of interest to psycholo- 
gists, neurologists, eugenicists, and popular writers. Reference to 
professional literature will show that the classroom teacher of the 
primary grades, who has the best opportunity to observe hand prefer- 
ences in non-established or poorly established activities, has taken no 
especial interest in the subject. The majority of elementary teachers 
of a decade ago seemed to have a single rule in establishing writing 
handedness—all children must write with the right hand. Many 
left-handed adults tell us of excruciating ordeals they were required to
undergo in trying to “break up” left-handed tendencies. The
majority of elementary teachers of today likewise seem to have a 
single rule in regard to establishing handedness in writing—the right- 
handed child should write with his right hand and the left-handed 
child should be permitted to write with his left hand. 

That a child upon entering school, in his eagerness to do exactly 
the right thing, may be influenced by unimportant factors so that he 
may show an unnatural hand preference for writing movements is 
usually not recognized. A child’s lack of critical judgment is mani- 
fested constantly in his imitation of non-consequential factors in hopes 
of attaining significant traits or habits. Thus, a child may quite 
confidently try to smoke in order to become manly like father, or 
try to walk pigeon-toed in order to be like a greatly admired friend. 
Imitation of hand preference for writing may also be made by conscious
imitation of an admired playmate or teacher or by unconscious
imitation when the child is anxiously trying to follow, in most minute
detail, every movement of one demonstrating a desired habit or skill.
That first grade children may show unnatural hand preferences for
writing and that the primary teacher may unwittingly mistake these
unnatural preferences for innate dominance, was revealed in a recent
study of a group of left-handed writers.

1 This study was made during the second semester of 1929-1930 in the second
grade (IIB) of the Melrose Avenue School, Los Angeles City Schools. This
school is located in a residential district of Hollywood. The children are normal
children and come from much above average homes.

Mrs. Colyer, the second grade teacher of the group studied, interviewed the
parents of the children and carried out all the remedial procedure in establishing
a reversal of handedness in writing.

Of a second grade group of thirty-four children, sixteen were left- 
handed writers. Both the first and the second grade teachers of the 
group were left-handed; the children’s writing habits were established 
before they entered the second grade. The improbability of so high a 
percentage of left-handed children in an unselected group of normal 
individuals led to a study of these children in order to determine, if 
possible, the native handedness of each left-handed writer and to effect 
a reversal of handedness in cases where that might seem warrantable. 

Since no single test has been devised whereby a child’s native 
handedness can be definitely discovered, each child was given a series 
of tests, one for eyedness and nine for handedness. Native eyedness 
is perhaps the best single indication of a child’s natively dominant 
side. Study of bodily asymmetry has resulted in the recognition of 
dominance of one side of the body over the other, that is, individuals 
are either dextroexpert, generally, as to ear, eye, hand, and foot, or 
else they are sinistroexpert.²

Hand- and foot-preferences are subject to training, but eye prefer- 
ence, except for accident or disease, remains uninfluenced by experi- 
ence. Hence the eye test was considered the criterional test for native 
handedness, but it was not considered justifiable to effect a reversal 
of handedness on the basis of eyedness only. There are many degrees 
of native handedness, that is, some individuals are strongly right- or 
left-handed; efforts to reverse their hand preferences are of no avail 
and may even be followed by unfortunate consequences. Others 
are so mildly left- or right-handed that they change their preferences 
with the slightest training. Mildly left-handed individuals, because 
they wish to conform to the majority and because most appliances 
are designed for right-handed individuals, usually prefer to train their 





1 Scheidemann, Norma V.: A Study of the Handedness of Some Left-handed
Writers. The Pedagogical Seminary and Journal of Genetic Psychology, Vol.
XXXVII, Dec. 1930, pp. 510-16.

2 Gould, G. M.: “Right-handedness and Left-handedness.” Lippincott,
1908, pp. 18-20.

right hands. Children showing consistent right hand preference in a
series of tests for handedness and left eye preference in tests for eyedness
may well be permitted to establish right hand writing habits.
About twenty-three per cent of presumably right-handed children
have been found to be left-eyed.¹ In regard to left-handedness associated
with right-eyedness Ballard² found that of fifty-one left-handed
individuals fifty-seven per cent proved to be right-eyed; Quinan³
found fourteen such individuals in a group of twenty-eight. Parsons⁴
found only four such cases among six hundred eight right-eyed
children. These findings made us cautious of hasty diagnosis.

For this study seven of the eight tests used by Haefner⁵ in discovering
the composition of hand dominance of a group of children,
and an additional test, were selected as being suitable for discovering
hand preference. Each test was given three times; the hand used in
two of the trials was recorded as the child’s preferred hand.⁶ In
summary the tests used for handedness were as follows:

1. Cutting.—The child was required to pick up a pair of scissors placed directly
before him and to cut very carefully along an irregular line drawn upon a sheet of
paper. The hand holding the scissors was recorded.

2. Winding.—A long cord was fastened to a pencil and placed directly before 
the child. He was then required to wind the cord about the pencil. The hand 
doing the winding was recorded. 

3. Throwing.—The child was required to throw a soft ball to the examiner. 
The hand used to throw was recorded. 

4. Receiving.—The child was given a small object. The hand with which the
object was received was recorded.

5. Easy Reaching.—The child was required to reach for a ball placed directly 
before him on a table. The hand used was recorded. 





6. Energetic Reaching.—The child was required to reach for a ball placed at a
distance requiring an energetic reach. The hand used was recorded.

7. Thumb Up.—Each child was asked to fold his hands by interclasping his
fingers. The thumb that was placed on top was recorded.

8. Batting.—The child was required to hold a bat ready to strike a ball about
to be pitched by the examiner. The hand that was nearer the batting end of the
bat was recorded.

1 Mills, Lloyd: Eyedness and Handedness. American Journal of Ophthalmology,
Vols. I-IX, 1925, pp. 106-113. In this study one hundred eighty left-eyed children
were found among seven hundred eighty-four presumably right-handed children.

Parsons, B. S.: “Lefthandedness.” Macmillan, 1924. Parsons claims that
about thirty per cent of all individuals are left-eyed and would be left-handed
were it not for the operation of social pressure.

2 Ballard, P. B.: Sinistrality and Speech. Journal of Experimental Pedagogy,
Vol. I, 1911-1912, p. 298.

3 Quinan, C.: A Study of Sinistrality and Muscle Coordination in Musicians,
Iron-workers and Others. Archives of Neurology and Psychiatry, Vol. VII, 1922,
pp. 352-360.

4 Parsons: Op. cit., p. 107.

5 Haefner, Ralph: “The Educational Significance of Left-handedness.” Teachers
College Contributions to Education, No. 360, 1929.

6 This method of scoring was used by Rife, J. M.: Types of Dextrality. Psychological
Review, Vol. XXIX, 1922, pp. 474-480.


The test for eyedness was as follows: 


9. Eyedness.—A sheet of paper with a small hole torn in the center was placed 
in the child’s hands at about a foot or a foot and a half from his face. A bit of 
crumpled paper was placed upon the floor. The child was directed to look through 
the hole at the bit of crumpled paper. Without permitting the child to move the 
sheet of paper, the examiner placed her hands alternately over the child’s left 
and right eyes. The child reported whether or not he was able to see the bit of 
paper when one of his eyes was covered. Failure to see the bit of paper indicated
that the dominant eye was covered. 
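
The two-of-three scoring rule used for the handedness tests can be sketched in a few lines of Python. This is only an illustration: the function name `preferred_hand`, the 'L'/'R' coding, and the sample records are assumptions made for the example and do not come from the study.

```python
from collections import Counter

def preferred_hand(trials):
    """Return the hand ('L' or 'R') used in at least two of three trials.

    `trials` is a sequence of three entries, each 'L' or 'R'.
    The name and the coding are hypothetical, chosen for illustration.
    """
    if len(trials) != 3 or any(t not in ("L", "R") for t in trials):
        raise ValueError("expected exactly three trials coded 'L' or 'R'")
    # With three trials and two possible hands, one hand always
    # occurs at least twice, so the most common entry is the majority.
    hand, _count = Counter(trials).most_common(1)[0]
    return hand

# Invented records for one child on two of the tests.
records = {"cutting": ["R", "R", "L"], "throwing": ["L", "L", "L"]}
summary = {test: preferred_hand(t) for test, t in records.items()}
```

Because every test has an odd number of trials and only two outcomes, the rule never produces a tie.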


These nine tests were given to the sixteen left-handed writers. 
The following table gives the results of the test findings. 


HAND- AND EYE-PREFERENCES OF SIXTEEN LEFT-HANDED WRITERS OF A SECOND
GRADE GROUP OF THIRTY-FOUR NORMAL CHILDREN

[Table not recoverable from the source.]

According to the test findings, we felt confident that ten of these
sixteen left-handed writers should be using their right hands for writing,
namely: Ervin, Shirley, Marvin, Yvonne, Vera May, Richard,
Melville, Cecile, Mary, and Anita. Carl and Jean were, perhaps,
very mildly left-handed and under other conditions would undoubtedly
have established right-handed writing habits.

Before any procedure to reverse the handedness of any child was
begun, the second grade teacher called the mothers of the children
to the school individually. The situation was explained and the series
of nine tests was given to the child in the mother’s presence so that
the mother could see her child’s hand preference in acts other than
writing. In all cases the mothers were very much interested and, in
all but one case, were eager to cooperate in reversing the child’s handedness
in writing.

The particular methods employed in effecting a reversal of handed- 
ness were different for each child. Some children needed scarcely 
more than an explanation and some enthusiasm over the early right- 
handed writing attempts. Other children required more encourage- 
ment, frequent reminding, and general procedures for training the 
right hand much like those given to beginning writers. In many cases 
the first attempts with the right hand were better than the child’s 
left-handed writing. Most of these children were using their right 
hands easily and naturally within two or three weeks, and all but one 
were doing so at the end of the semester (after a period of six weeks). 
A few weeks after the opening of the fall semester, the teachers of the 
children whose handedness had been reversed were visited and ques- 
tioned in regard to the handedness of the children. All the children, 
but the one who had had difficulty in the spring semester, were using 
their right hands. 

During each conference with the mother and later during a ques- 
tioning of the child, an effort was made to determine the factors that 
might have influenced the child in using the left hand for writing. 
The immediate influences for left hand writing in the ten cases defi- 
nitely right-handed and the two cases who could use their right hands 
to advantage, seemed to be as follows: 


Influenced by left-handedness of:
    [illegible] ............................................... 1
    [illegible] ............................................... 1
    [illegible] ............................................... 4
Statement of second grade teacher ............................. 1
[illegible] ................................................... 5


It is of interest to note that in four cases the mothers did not know
that their children were writing with their left hands. The slight
incidents that caused some to use their left hands are noteworthy.
Thus, when the second grade teacher remarked: “We seem to have an
epidemic of left hands,” one boy decided he, too, would use his left
hand “just to make one more.” Another child said she began to use
her left hand because her playmate broke her right arm and found it
hard to use her left hand for writing. The child said she did not want
“to be caught like that.”

Although the left-handedness of the first grade teacher was found
to be a direct influence in establishing but one case of left-handed
writing, still we may safely assume that indirectly the teacher’s
left-handedness played a very important rôle in establishing left-handed
writing habits among these children. A left-handed teacher,
especially if she had unhappy experiences or difficulty in trying to
establish right-handed writing habits, is, perhaps, more apt to permit
a child to establish left-handed habits than is a right-handed teacher.
Left-handed teachers may be prejudiced toward having a child use
his left hand just as right-handed teachers may be prejudiced toward
having a child use his right hand.


THE SIGNIFICANCE OF A DIFFERENCE BETWEEN
“MATCHED” GROUPS


E. F. LINDQUIST
State University of Iowa


A type of experimentation very frequently employed in education 
and psychology is that which makes use of what are commonly known 
as “matched groups,” or as “matched control groups.” It is the
purpose of this article to draw attention to an important error in statis- 
tical analysis that has been almost universally characteristic of the 
reports of such experiments, and to suggest an improved statistical 
procedure and discuss its possibilities. 

In order to remove any possible ambiguity that may surround the 
term, “matched groups,” it may be well to begin with a specific illustration.
Let us consider an experiment to determine the relative
effectiveness of the “additive” and the “take-away” methods of teaching
subtraction in third grade arithmetic. The usual procedure in
such an experiment would consist, briefly, of teaching one group of
pupils by the “additive” method and another group by the “take-away”
method and then of securing, after the period of instruction, a
final measure of ability in subtraction for the pupils in both groups. 
The method then considered the better would be that under which the 
pupils showed the higher average final ability in subtraction. 

In order that this procedure be most valid, it is essential that all 
factors influencing the final measure, except the one factor under 
investigation—that of method, be kept as nearly as possible the same 
for both groups. One of these factors most important to control is 
that of the initial ability of the pupils to profit by instruction. To 
control this factor, it is customary to “match” the two groups with
respect to some measure of initial ability. This is done by so selecting
the two groups that for each pupil in the first group there exists a
corresponding pupil in the second group identical with him with
respect to this initial measure. In the case of this specific illustration,
this initial measure might have been the score on a general intelligence
test, or the score on a survey test in arithmetic, or some other similar
measure. If the intelligence test score had been used, the two groups
would have been so selected as to have identical distributions of
intelligence test scores—in other words, they would have been
“matched” on the basis of intelligence.

Now it is a well known fact that, even though the two methods are
equally effective in general (i.e., for the entire population of Grade III
pupils), a difference in the average final scores of the two matched
groups would almost invariably be found in a single experiment of this
kind. Assuming equal “true” effectiveness, this would nevertheless
occur because the experiment dealt with limited samples, and because
differences due to chance in the selection of the samples would alone
account for different performances on the final test in subtraction. It
is therefore the responsibility of the experimenter, before drawing
any general conclusions from a single obtained difference, to demonstrate
objectively that the difference he obtained was larger than could
reasonably be accounted for by chance, or sampling, errors. To do
this he must know how large is the expected value of the difference
that chance alone would account for in an experiment of this kind.
In other words, he must know the Probable Error of the Difference.

In nearly all reports that have yet been made of experiments of
this type, the formula that has been employed to determine the value
of this probable error has been the familiar $PE_{diff}$ formula

$$PE_{diff} = \sqrt{PE_1^2 + PE_2^2} \tag{1}$$

in which the PE’s under the radical are those of the respective means
of the final scores for the two matched groups, each of which has
been found through the use of the formula

$$PE_M = 0.6745\,\frac{\sigma}{\sqrt{N}} \tag{2}$$

This Formula for the Probable Error of the Mean Is Not Valid for
Use with Matched Groups.¹—It is based upon the assumption that the
samples used are strictly random selections from the populations they
represent. This assumption is not applicable to matched groups.
The process of matching on the basis of a measure which is correlated
with the final measure destroys the randomness of the samples with
respect to this final measure. The probable amount of sampling error
in the obtained difference, instead of being as large as that indicated
by the formulas given above, is usually considerably less, in some
cases by more than fifty per cent.
1 Except in the very rare case where the measures used for matching show no
correlation with the final measures.

The reasonableness of these last two statements may be made more
clear by the following illustration, based in part upon the one previously
used. Suppose that a large number of samples, each of the same size,
and each strictly random (not matched), were selected from the entire
population of Grade III pupils. Suppose the pupils in each sample
were taught under identical conditions by the same method, and then
measured by the final test in subtraction. Variations in sampling
would then result in a variation in the values of the mean scores of
these samples on the final test. The standard deviation of these mean
scores would be the standard error of any one such mean and, in this
case, since random sampling is specified, this standard error could be
validly measured by the usual formula for the standard error of the
mean.

Now let us consider a similar case, but one in which “matching”
is involved. If, in the first case (of random sampling) an intelligence
test had been given to the same samples, a variation would also have 
been found in the mean scores on the intelligence test, again due to 
sampling errors. Now since intelligence is correlated with scores 
on the subtraction test, a sample that due to chance showed a relatively 
high mean score on the intelligence test would also show a relatively 
high mean score in subtraction. If, therefore, a change were arbi- 
trarily made in that sample so as to bring the mean intelligence score 
down to the mean of all samples, the effect would be to bring the mean 
subtraction score of that same sample nearer to the mean subtraction 
score of all samples than it was previously (in the case of random 
sampling). A restriction of the variation in mean intelligence scores 
to zero (the effect of identical matching) therefore has the effect of 
decreasing the chance fluctuations in the mean subtraction scores. 
In other words, the effect of matching is to reduce the value of the 
standard error of the mean subtraction score of a sample. The amount
of this reduction would depend upon the degree of relationship between 
intelligence scores and subtraction scores. 
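
The argument of the preceding paragraph can be checked roughly by simulation. The sketch below is not part of the original article. It assumes standard normal scores with a chosen correlation `rho`, and it imitates matching by keeping only those random samples whose mean initial score falls within a narrow band (`band`) around the population mean. All names and parameter values are invented for illustration.

```python
import math
import random
import statistics

def simulate(rho=0.8, n=25, n_samples=8000, band=0.02, seed=1):
    """Compare the spread of sample means with and without matching.

    Scores are standard normal with correlation `rho` between the
    initial measure (i) and the final measure (s). "Matching" is
    imitated by keeping only samples whose mean initial score lies
    within `band` of the population mean (zero).
    Returns (SD of all sample means, SD of the matched sample means).
    """
    rng = random.Random(seed)
    all_means, matched_means = [], []
    for _ in range(n_samples):
        sum_i = sum_s = 0.0
        for _ in range(n):
            i = rng.gauss(0.0, 1.0)
            # s has unit variance and correlation rho with i
            s = rho * i + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
            sum_i += i
            sum_s += s
        mean_i, mean_s = sum_i / n, sum_s / n
        all_means.append(mean_s)
        if abs(mean_i) < band:
            matched_means.append(mean_s)
    return statistics.pstdev(all_means), statistics.pstdev(matched_means)

se_random, se_matched = simulate()
# Under these assumptions the ratio should fall near sqrt(1 - rho**2).
ratio = se_matched / se_random
```

Under these assumptions the spread of the matched means should come out noticeably smaller than that of all means, by a factor close to the radical discussed in the text. The band-selection device is only a convenient stand-in for pupil-for-pupil matching.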

What is needed, therefore, is a new formula for the standard error 
of the mean of matched samples that takes into account this restrictive 
influence of matching upon chance fluctuations in the mean. The 
type of reasoning already indicated led the author to suggest the 
problem of deriving such a formula to Mr. Samuel 8. Wilks, a graduate 
student in mathematical statistics at the State University of Iowa. 
As a result the desired formula was obtained, and is as follows:

$$\sigma_M = \frac{\sigma}{\sqrt{N}}\sqrt{1 - r^2} \tag{3}$$

in which $\sigma_M$ represents the standard error of means of matched groups.
It will be noted that this formula differs from the usual formula only
in that the right-hand member is multiplied by the radical $\sqrt{1 - r^2}$,
in which r is the Pearson product-moment coefficient of correlation
between the measures used as the basis for matching and the measures
for which the mean is computed. In the case of the illustration, r is
the correlation between intelligence test scores and subtraction
scores.
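
A small worked example may make the size of the correction concrete. Taking formula (3) to be the usual standard error $\sigma/\sqrt{N}$ multiplied by $\sqrt{1 - r^2}$, as the text describes, and using invented values $\sigma = 10$, $N = 100$, $r = 0.6$ (none of which come from the article):

```latex
\frac{\sigma}{\sqrt{N}} = \frac{10}{\sqrt{100}} = 1.0,
\qquad
\sigma_M = 1.0 \times \sqrt{1 - 0.6^2} = 1.0 \times \sqrt{0.64} = 0.8
```

With these assumed figures, matching would reduce the standard error of the mean by one fifth.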

Since the original derivation of the formula by Mr. Wilks, the 
author has devised a simpler form of proof, which is reproduced here. 
This proof is not mathematically so rigid as that provided by Mr. 
Wilks, but it will perhaps prove easier to understand for the reader 
not highly trained in mathematical statistics. For the sake of further 
simplification, this proof will be here set in terms of the specific 
situation already used in the previous illustrations. 

Suppose that we have, for a very large number of pupils selected 
at random from the Grade III population, the scores on an initial 
intelligence test and the scores on a final subtraction test given after 
a course of instruction by one of the two methods described. From 
this large group it would of course be possible to select a number of 
pupils all of whom made exactly the same score on the intelligence 
test. These pupils would then be “‘matched” on the basis of intelli- 
gence. It would then be possible to predict, without direct calculation, 
the standard deviation of the scores of these selected pupils on the 
subtraction test. This could be done through the use of the familiar 
standard error of estimate formula 


$$\sigma_{s \cdot i} = \sigma_s \sqrt{1 - r_{is}^2} \tag{4}$$

in which $\sigma_{s \cdot i}$ represents the standard deviation of subtraction scores
for the selected pupils, $\sigma_s$ represents the standard deviation of subtraction
scores for all pupils, and in which $r_{is}$ represents the correlation
between intelligence scores and subtraction scores.

Now, instead of thinking of the scores of individual pupils, suppose 
that we have, for a very large number of samples, all of the same 
size and all selected strictly at random from the Grade III population, 
the mean scores on the intelligence test and on the subtraction test. 
Again, as in the case of individual pupils, it would be possible to select 
a number of samples all showing the same mean score in intelligence. 





1 The original, and more rigid mathematical derivation of the formula will be
found in an article by Mr. Wilks in this copy of the JOURNAL.

These samples would then be “matched” with respect to intelligence.
Again it would be possible, through the use of the standard error of
estimate formula, to predict for these selected samples the standard
deviation of mean subtraction scores. This time, however, formula
(4) becomes

$$\sigma_{M_s \cdot M_i} = \sigma_{M_s} \sqrt{1 - r_{M_i M_s}^2} \tag{5}$$

in which $\sigma_{M_s \cdot M_i}$ represents the standard deviation in the mean subtraction
scores of the selected (matched) samples, in which $\sigma_{M_s}$ represents the
standard deviation in mean subtraction scores for all samples, and in
which $r_{M_i M_s}$ represents the coefficient of correlation between mean
scores in intelligence and mean scores in subtraction for all random
samples.

Now the standard deviation in mean subtraction scores for all
random samples is given by the usual formula for the standard error
of a mean

$$\sigma_{M_s} = \frac{\sigma_s}{\sqrt{N}} \tag{6}$$

in which $\sigma_s$ represents the standard deviation in the subtraction
scores of individual pupils in a sample, and in which N represents the
number of pupils in the sample.

The coefficient of correlation between the means of a series of
samples has been shown by Kelley¹ to be equal to the coefficient of
correlation between the scores on which the means are based. Hence,
in this case,

$$r_{M_i M_s} = r_{is} \tag{7}$$

in which $r_{is}$ represents the correlation between intelligence test scores
and subtraction scores.


By substituting from (6) and (7) in (5) we secure the formula 


$$\sigma_{M_s \cdot M_i} = \frac{\sigma_s}{\sqrt{N}} \sqrt{1 - r_{is}^2} \tag{8}$$


which, by dropping the subscripts peculiar to the illustration, becomes 
the formula already given as formula (3). 

1 Kelley, Truman L.: “Statistical Method.” P. 178, formula (118).

Substituting the value of the standard error of the mean of matched
groups as given by formula (3) in the usual formula for the standard
error of a difference between means, we secure the following as the
correct formula to employ for determining the standard error of a
difference in obtained means for two matched groups

$$\sigma_{diff} = \sqrt{\left(\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}\right)\left(1 - r^2\right)} \tag{9}$$

where $\sigma_1$ and $\sigma_2$ represent the standard deviations in final scores of the
respective groups, where $N_1$ and $N_2$ represent the number of cases
in each, and where r represents the Pearson product-moment coefficient
of correlation between the measures used as the basis for matching
and the measures between which the final comparison in means is
made.
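
As a minimal sketch of how the standard error of a difference for matched groups might be computed in practice, the Python functions below implement the usual standard error of a difference and the same quantity multiplied by $\sqrt{1 - r^2}$, which is how the text describes formula (9). The function names and all numerical inputs are invented for illustration.

```python
import math

def se_diff_random(sd1, n1, sd2, n2):
    """Usual standard error of a difference between two independent means."""
    return math.sqrt(sd1**2 / n1 + sd2**2 / n2)

def se_diff_matched(sd1, n1, sd2, n2, r):
    """Standard error of a difference between means of matched groups:
    the usual value multiplied by sqrt(1 - r^2), following the text's
    description of formula (9)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return se_diff_random(sd1, n1, sd2, n2) * math.sqrt(1.0 - r**2)

# Invented figures, for illustration only.
usual = se_diff_random(12.0, 36, 12.0, 36)
matched = se_diff_matched(12.0, 36, 12.0, 36, r=0.6)
```

With these assumed figures the matched standard error is four fifths of the usual one, so an obtained difference that looks unremarkable against the usual value may look considerably larger against the matched value.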

It is obvious, from an inspection of formula (9), that the true 
standard error of an obtained difference between matched groups 
might be considerably less than that yielded by the usual formulas, 
and that hence many differences which have been considered account- 
able for by sampling errors might have been statistically significant. 
Fortunately, the incorrect use of the formulas based on the assump- 
tion of random sampling resulted in errors only in the direction of 
conservativeness. That fact, however, is no valid excuse for their 
continued application, especially when the correct technique proves 
so simple in application. 

It is important to note that formula (9) does not depend upon the 
relation of the mean initial scores of the matched groups to the “true”
initial score of the entire population which is being considered. For 
example, if the matched samples of the original illustration had due 
to chance been above or below the average of all Grade III 
children in intelligence, the formula would still apply. This follows 
by analogy from the fact that the standard error of estimate formula 
applies to all values of the variable from which the estimates are made. 

The proof of formula (9) that is contained in this article also 
suggests that the formula applies even though the exact distribution 
of initial scores is not the same for both matched groups, if only their 
mean initial scores are the same. This indicates that the formula 
should be valid for use with groups that have not been matched “pupil 
for pupil,” but in which the means and standard deviations alone have
been equated. A more rigid mathematical proof of this proposition, 
however, should be provided before much confidence is placed in it. 

It is also important to note that formula (9) does not indicate
how far the obtained difference between two matched samples is
likely to deviate from the difference that would have been obtained
had the entire population been measured, but tells only how far the
obtained difference is likely to deviate from the difference that would
have been found between infinitely large groups showing the same
distribution of initial measures as the matched samples that were used.
Where the purpose of experimentation, however, is simply to determine
(as is usually the case) whether there is or is not any real difference
between the two methods of learning or teaching (i.e., whether
or not the obtained difference is statistically significant, and where
the absolute amount of the difference for the entire population is not
demanded), this latter caution is not of very great practical importance.
If, for example, a method is truly superior for pupils at one
level of intelligence, it is certainly likely to be superior for pupils at 
another level not far removed from the first. If reasonably large 
samples are used (N > 30), and if the first sample is selected roughly 
at random from the entire population and the second matched with 
it (a usual procedure), it is not reasonable to suppose that the difference 
in intelligence between the matched groups and the entire population 
(due to sampling error in the first sample selected) will be large enough 
to have any effect upon the validity of conclusions concerning the 
general relative effectiveness of the methods investigated. 

It should hardly be necessary to point out that formula (3) is 
useful only to measure sampling fluctuations from sample to sample 
in the means of samples that have been matched with respect to a 
related variable. It in no sense, therefore, supplants the usual for- 
mula for the standard error of a mean, since that formula is in no case 
valid in the specific situation to which formula (3) applies. As far as 
most practical applications are concerned, formula (3) need not con- 
cern the investigator at all, formula (9) being the only one that need 
be directly applied in the usual “matching” experiment. In the case,
however, where the effect of the methods compared is to result in 
different values of the correlation coefficient between initial and final 
measures for the two groups, it is necessary to apply formula (3) 
independently to the mean of each sample, and then substitute the 
resulting values for the standard errors of the means in the usual for- 
mula for the standard error of the difference. Formula (9) is given 
in combination form for the situation where any difference in method 
does not seriously affect the correlation between initial and final 
scores, which condition usually exists. Even though the value of r 
is slightly different for the two groups, formula (9) will still yield a 
useful and close approximation to the true standard error of the difference.
It is considered desirable to propose formula (9) in combination
form since it eliminates the necessity of calculating the standard error 
of each mean independently, and hence avoids the danger that the 
less critical investigator will attempt to interpret these standard 
errors in the way in which the usual standard error of the mean is 
interpreted.


THE STANDARD ERROR OF THE MEANS OF
“MATCHED” SAMPLES


SAMUEL S. WILKS 
State University of Iowa 


The problem of determining the variation in the mean of one 
character of the items of a sample when the distribution of another 
correlated character is made identical for all samples, item by item, 
with an arbitrary distribution, was suggested to me by Prof. E. F. 
Lindquist of the State University of Iowa. A discussion of the use 
and importance of the theory as a statistical technique in certain 
types of experimental work will be found in an article by Lindquist 
in this issue of the JOURNAL. In this paper I shall consider only the
mathematical derivation of the expression for this variation. 

In order to state the problem more accurately, we may describe 
the type of sampling involved in the following manner. Suppose a 
sample of N items is drawn from a population in such a way that the 
distribution of a particular character X of each item is made identical, 
item for item, with a given X-distribution. This selection or matching 
of X’s is made at random relative to a second character Y and the only 
case of general interest is that in which there is a correlation between 
X and Y. The question naturally arises: What is the expected variation
in the mean¹ of the Y’s due to such sampling? It should be noted
that the given distribution of X may be of an arbitrarily selected form 
or it may be derived at random. The results, as we shall see, are 
independent of the form of this distribution. 

Let us assume that X and Y are normally distributed, and that a 
correlation of r exists between them. As was previously stated, we 
can obtain more general results at once without making this assump- 
tion of normality, but it is the first one made because of the greater 
practical interest in the case of a normal distribution. For con- 
venience, let us think of the X-Y plane as being divided into small 
squares. Then in any finite sample of N items there are only a finite 
number of squares into which the points $(x, y)$ (each representing an
item of the sample) will fall. The distribution of Y’s associated with
a given value $x_p$ of X, which is taken as the mid-point of the p-th
interval of X’s, is called the $x_p$ array of Y’s. Similarly $y_i$ is the
mid-point of the i-th interval of Y’s. Of the N pairs of X and Y, $n_p$
will have the value of the X character in the interval $x_p$, and these
form the $x_p$ array of Y’s. This array will have its mean and its
standard deviation, which we will denote by $\bar{y}_p$ and $\sigma_p$
respectively. The mean of all Y characters will be $\bar{y}$ and their
variability will be given by $\sigma_y$. The number of the N items falling
into the square with center $(x_p, y_i)$ will be denoted by $n_{pi}$.

¹ I have obtained expressions for the variation in the standard deviation of Y
and in the correlation coefficient r between X and Y, but have not included these
in this article. They will be made available later in my Doctor’s thesis.

Let $\delta$ denote the sampling deviation of a variate from its true value.
Then since the distribution of X’s is constant from sample to sample,
it follows that the deviation $\delta n_p$ is zero. We shall make use of the
propositions¹ that if $s$ and $s'$ are the frequencies in any two mutually
exclusive non-independent (when either or both are subjected to
random sampling variation) frequency classes of a frequency
distribution of $M$ elements, the standard deviation $\sigma_s$ of $s$ due to
random sampling is given by

\[ E[(\delta s)^2] = \sigma_s^2 = s\left(1 - \frac{s}{M}\right) \tag{1} \]

and the correlation $r_{ss'}$ between deviations in $s$ and $s'$ due to random
sampling is given by

\[ E(\delta s \cdot \delta s') = r_{ss'}\,\sigma_s\sigma_{s'} = -\frac{s s'}{M} \tag{2} \]
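
The two propositions can be checked by exact enumeration. The following is only an illustrative sketch: it assumes that the M elements are drawn independently into three classes with fixed class probabilities (so that $s$ and $s'$ are the expected frequencies), and the function name and the numerical values are hypothetical, not taken from the article.

```python
from fractions import Fraction
from math import comb


def class_frequency_moments(m, p1, p2):
    """Exact variance of the first class frequency and covariance of the
    first two class frequencies, for m independent draws into three
    classes with probabilities p1, p2 and 1 - p1 - p2 (an assumed model)."""
    p3 = 1 - p1 - p2
    # Probability of every possible pair of frequencies (a, b).
    probs = {}
    for a in range(m + 1):
        for b in range(m - a + 1):
            c = m - a - b
            probs[(a, b)] = comb(m, a) * comb(m - a, b) * p1**a * p2**b * p3**c
    mean_a = sum(a * pr for (a, _), pr in probs.items())
    mean_b = sum(b * pr for (_, b), pr in probs.items())
    var_a = sum((a - mean_a) ** 2 * pr for (a, _), pr in probs.items())
    cov_ab = sum((a - mean_a) * (b - mean_b) * pr for (a, b), pr in probs.items())
    return mean_a, mean_b, var_a, cov_ab


# Hypothetical example: M = 10 elements, expected frequencies s = 5, s' = 3.
M = 10
s, s_prime, var_s, cov = class_frequency_moments(M, Fraction(1, 2), Fraction(3, 10))
print(var_s == s * (1 - s / M))   # compares the exact variance with proposition (1)
print(cov == -s * s_prime / M)    # compares the exact covariance with proposition (2)
```

Exact fractions are used so that the comparison with (1) and (2) is not blurred by rounding.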


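Now it is obvious that we may write

\[ N\bar{y} = S_p[S_i(n_{pi}\,y_i)] \tag{3} \]

where the sums are taken over all values of $p$ and $i$.
Taking the variation we have

\[ N\,\delta\bar{y} = S_p[S_i(\delta n_{pi}\,y_i)] \tag{4} \]

squaring

\[ N^2(\delta\bar{y})^2 = \{S_p[S_i(\delta n_{pi}\,y_i)]\}^2 = S_p[S_i(\delta n_{pi}\,y_i)]^2 + S_{pp'}[S_i(\delta n_{pi}\,y_i)\,S_i(\delta n_{p'i}\,y_i)] \tag{5} \]

where $S_{pp'}$ denotes the summation over all values of $p$ and $p'$ except
for $p = p'$.

Expanding again,

\[ N^2(\delta\bar{y})^2 = S_p[S_i(\delta n_{pi}^2\,y_i^2)] + S_p[S_{ij}(\delta n_{pi}\,\delta n_{pj}\,y_i y_j)] + S_{pp'}[S_i(\delta n_{pi}\,\delta n_{p'i}\,y_i^2)] + S_{pp'}[S_{ij}(\delta n_{pi}\,\delta n_{p'j}\,y_i y_j)] \tag{6} \]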
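where the meaning of $S_{ij}$ is obvious (the summation over all values of
$i$ and $j$ except for $i = j$).

¹ For proof, see Rietz, H. L.: “Mathematical Statistics.” 1927, p. 119.

Summing and dividing by the number of possible samples of this
kind, and using the proposition that the expected value of a sum is
equal to the sum of the expected values, we get

\[ N^2\sigma_{\bar{y}}^2 = S_p[S_i(E(\delta n_{pi}^2)\,y_i^2)] + S_p[S_{ij}(E(\delta n_{pi}\,\delta n_{pj})\,y_i y_j)] + S_{pp'}[S_i(E(\delta n_{pi}\,\delta n_{p'i})\,y_i^2)] + S_{pp'}[S_{ij}(E(\delta n_{pi}\,\delta n_{p'j})\,y_i y_j)]. \tag{7} \]

But from (1) it is evident that

\[ S_p[S_i(E(\delta n_{pi}^2)\,y_i^2)] = S_p\left[S_i\left(n_{pi}\left(1 - \frac{n_{pi}}{n_p}\right)y_i^2\right)\right] \tag{8} \]

and by (2)

\[ S_p[S_{ij}(E(\delta n_{pi}\,\delta n_{pj})\,y_i y_j)] = -S_p\left[S_{ij}\left(\frac{n_{pi}\,n_{pj}}{n_p}\,y_i y_j\right)\right]. \tag{9} \]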


Let us consider the expression $E(\delta n_{pi}\,\delta n_{p'j})$ for $p$ different from $p'$.
Since $n_p$ is invariable from sample to sample, it follows that the
deviation $\delta n_{pi}$ for any $i$ within the group $n_p$ cannot affect the deviation
$\delta n_p$, which is zero, and therefore cannot affect the deviation $\delta n_{p'j}$ of the
sub-group $n_{p'j}$ in any other group $n_{p'}$ for $p'$ different from $p$. Hence
the deviations $\delta n_{pi}$ and $\delta n_{p'j}$ are independent for all values of $p$ and
$p'$ provided $p$ is different from $p'$.

Hence

\[ E(\delta n_{pi}\,\delta n_{p'j}) = E(\delta n_{pi}) \cdot E(\delta n_{p'j}) = 0. \tag{10} \]


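Thus the last two sets of terms of (7) vanish, and combining (8)
and (9) and placing this value in (7), we get

\[ N^2\sigma_{\bar{y}}^2 = S_p\left\{S_i(n_{pi}\,y_i^2) - \frac{[S_i(n_{pi}\,y_i)]^2}{n_p}\right\} = S_p[S_i(n_{pi}\,y_i^2) - n_p\bar{y}_p^2] = S_p[S_i(n_{pi}(y_i - \bar{y}_p)^2)]. \tag{11} \]

But

\[ S_i[n_{pi}(y_i - \bar{y}_p)^2] = n_p\sigma_p^2 \]

hence

\[ N^2\sigma_{\bar{y}}^2 = S_p[n_p\sigma_p^2]. \tag{12} \]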
Since we have assumed normal distribution, and thus a homoscedastic
system with linear regression, all arrays of Y’s have the same
standard deviation $\sigma_y\sqrt{1 - r^2}$.
Then

\[ S_p[n_p\sigma_p^2] = N\sigma_p^2 = N\sigma_y^2(1 - r^2). \]


208 The Journal of Educational Psychology 


The standard error of the mean $\bar{y}$ is finally

\[ \sigma_{\bar{y}} = \frac{\sigma_y\sqrt{1 - r^2}}{\sqrt{N}} \tag{13} \]

which we note again is independent of the distribution of X in the
sample. It must not be construed, however, that the expected value
of the mean of the Y’s is independent of the distribution of the X’s.
I state, without giving proof here, that the expected value of $\bar{y}$ is given
by

\[ \bar{y} = y_0 + r\,\frac{\sigma_y}{\sigma_x}(\bar{x} - x_0) \tag{14} \]

where $x_0$ and $y_0$ are the true means of the X’s and Y’s for the entire
population and $\bar{x}$ is the mean of the given distribution of X’s.
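
As a numerical illustration of formulas (13) and (14), the sketch below computes the standard error of the mean of a matched sample and the expected value of that mean. It is an assumption-laden example only: the function names and all numerical values are hypothetical, and formula (14) is read here as the usual linear-regression expression.

```python
import math


def matched_mean_standard_error(sigma_y, r, n):
    """Standard error of the mean of Y when the sample is matched on X,
    following formula (13): sigma_y * sqrt(1 - r^2) / sqrt(N)."""
    return sigma_y * math.sqrt(1.0 - r**2) / math.sqrt(n)


def expected_matched_mean(y0, x0, x_bar, r, sigma_x, sigma_y):
    """Expected mean of Y for a sample whose X's have mean x_bar,
    using the linear-regression reading of formula (14)."""
    return y0 + r * (sigma_y / sigma_x) * (x_bar - x0)


# Hypothetical values, chosen only for illustration.
se_matched = matched_mean_standard_error(sigma_y=10.0, r=0.6, n=100)
se_random = 10.0 / math.sqrt(100)  # usual standard error of a mean, for comparison
exp_mean = expected_matched_mean(y0=50.0, x0=100.0, x_bar=105.0,
                                 r=0.6, sigma_x=15.0, sigma_y=10.0)
print(se_matched, se_random, exp_mean)
```

With any correlation different from zero the matched standard error falls below the usual one by the factor $\sqrt{1 - r^2}$, which is the practical point of the result.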

By similar methods it is not difficult to show that if all of the
items of a sample are selected or matched on $s - 1$ characters, where
$s$ characters are normally distributed and not independent, and with
linear multiple regression, then the standard error of the means of the
character $X_s$ is given by

\[ \sigma_{\bar{x}_s} = \frac{\sigma_s\sqrt{1 - r_{s.12\ldots s-1}^2}}{\sqrt{N}} \tag{15} \]

where $r_{s.12\ldots s-1}$ is the multiple correlation coefficient of order
$s - 1$ of $X_s$ with the $s - 1$ variables.
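
For the simplest case of formula (15), matching on two characters ($s = 3$), a short sketch may help. The expression used for the squared multiple correlation is the standard two-predictor identity, which is not given in the article; the function names and numerical values are hypothetical.

```python
import math


def multiple_r_squared(r_s1, r_s2, r_12):
    """Squared multiple correlation of X_s with two matching characters,
    from the three pairwise correlations (standard two-predictor identity,
    assumed here; it is not stated in the article)."""
    return (r_s1**2 + r_s2**2 - 2.0 * r_s1 * r_s2 * r_12) / (1.0 - r_12**2)


def matched_mean_se_two_characters(sigma_s, r_s1, r_s2, r_12, n):
    """Formula (15) for s = 3: sigma_s * sqrt(1 - R^2) / sqrt(N)."""
    r_sq = multiple_r_squared(r_s1, r_s2, r_12)
    return sigma_s * math.sqrt(1.0 - r_sq) / math.sqrt(n)


# Hypothetical correlations, chosen only for illustration.
se_two = matched_mean_se_two_characters(10.0, 0.6, 0.5, 0.3, 100)
# With a second character that is uncorrelated with everything, the
# result falls back to the single-character case.
se_one = matched_mean_se_two_characters(10.0, 0.6, 0.0, 0.0, 100)
print(se_two, se_one)
```

Matching on an additional correlated character can only raise the multiple correlation, so the standard error from two characters does not exceed that from one.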

It may be of interest to know that all of the results contained in 
this paper have been confirmed independently through the application 
of an entirely different and somewhat more rigorous method of proof, 
which will be made available in a thesis soon to be published at the 
State University of Iowa.


RELIABILITY OF INTEGRATION INDEX
DIFFERENCES


JOHN W. DICKEY
State Normal School, Newark, N. J.

The formula

\[ K = \frac{M}{\sigma} \tag{1} \]

has been presented,¹ by the writer, as a quantitative measure of pupil
integration within the public schools. The formula giving the
reliability for separate indices has also been reported.²

The necessary comparison of indices (whether made by the same
population on different tests or on different forms of the same test, or
made by uncorrelated populations on different tests or the same test)
is statistically impossible without the use of formulas which yield the
reliability of their differences.

It is the purpose of this paper to derive the standard error and the
probable error formulas which are necessary in the comparison of
indices for both the correlated and the uncorrelated groups.

Let,

    $M_1$ = mean gross score on test one
    $M_2$ = mean gross score on test two
    $\sigma_1$ = standard deviation of gross scores on test one
    $\sigma_2$ = standard deviation of gross scores on test two
    $r_{12}$ = correlation between test one and test two
    $N$ = the population
    $K_1 = M_1/\sigma_1$ = Integration Index on test one
    $K_2 = M_2/\sigma_2$ = Integration Index on test two
    $\Delta = K_1 - K_2$ = an index difference

The standard deviation of such index differences, as

\[ \Delta = \frac{M_1}{\sigma_1} - \frac{M_2}{\sigma_2} \tag{2} \]

is the required formula.

¹ An Index of Integration. Journal of Educational Psychology, Vol. XX, No. 9,
Dec., 1929, p. 625.
² Note on the Reliability of the Index of Integration. Journal of Educational
Psychology, Vol. XXI, No. 3, March, 1930, p. 231.

By writing the total derivative of equation (2), we obtain the
differential equation

\[ d\Delta = \frac{\sigma_1\,dM_1 - M_1\,d\sigma_1}{\sigma_1^2} + \frac{M_2\,d\sigma_2 - \sigma_2\,dM_2}{\sigma_2^2} \tag{3} \]

When squaring, summing and dividing by the theoretical infinite
population, we get the variance of $\Delta$ in the form

\[ \sigma_\Delta^2 = \frac{\sigma_1^2\sigma_{M_1}^2 + M_1^2\sigma_{\sigma_1}^2 - 0}{\sigma_1^4} + \frac{\sigma_2^2\sigma_{M_2}^2 + M_2^2\sigma_{\sigma_2}^2 - 0}{\sigma_2^4} + \frac{2\left(-M_1M_2\,r_{\sigma_1\sigma_2}\sigma_{\sigma_1}\sigma_{\sigma_2} - \sigma_1\sigma_2\,r_{M_1M_2}\sigma_{M_1}\sigma_{M_2} + 0 + 0\right)}{\sigma_1^2\sigma_2^2} \tag{4} \]

The zeros in equation (4) occur because the correlation between
the two independent variables, $M$ and $\sigma$, is zero.
The following simplified formulas are the result when we substitute¹

\[ r_{M_1M_2} = r_{12}; \quad r_{\sigma_1\sigma_2} = r_{12}^2; \quad \sigma_M = \frac{\sigma}{\sqrt{N}}; \quad \sigma_\sigma = \frac{\sigma}{\sqrt{2N}}; \quad \text{and} \quad K = \frac{M}{\sigma} \]

in equation (4) and reduce to a convenient form.

\[ \sigma_\Delta^2 = \frac{1}{2N}\left[4 + K_1^2 + K_2^2 - 2r_{12}(K_1K_2r_{12} + 2)\right] \]

whence

\[ \sigma_{(K_1 - K_2)} = \frac{1}{\sqrt{2N}}\sqrt{4 + K_1^2 + K_2^2 - 2r_{12}(K_1K_2r_{12} + 2)} \tag{5} \]

and

\[ P.E._{(K_1 - K_2)} = \frac{0.6745}{\sqrt{2N}}\sqrt{4 + K_1^2 + K_2^2 - 2r_{12}(K_1K_2r_{12} + 2)} \tag{6} \]

In case the reliability of index differences (which occur when
different forms of the same test are used) is to be substantiated, the
correlation coefficient is at once the reliability coefficient of the test.

If the reliability of index differences (for uncorrelated groups) is
to be investigated, formulas (5) and (6) reduce immediately to the
formulas

\[ \sigma_{(K_1 - K_2)} = \sqrt{\frac{2 + K_1^2}{2N_1} + \frac{2 + K_2^2}{2N_2}} \tag{7} \]

and

\[ P.E._{(K_1 - K_2)} = 0.47694\sqrt{\frac{2 + K_1^2}{N_1} + \frac{2 + K_2^2}{N_2}} \tag{8} \]

where $N_1$ and $N_2$ are the respective populations.

Formulas (5) and (6) are used when indices are compared for any
given population; whereas, formulas (7) and (8) are used when indices
are compared for entirely different populations.

¹ Kelley, T. L.: “Statistical Method.” New York: The Macmillan Company,
1923, p. 178, formulas 118 and 121. The assumptions, in the case of $r_{\sigma_1\sigma_2} = r_{12}^2$,
of rectilinearity, homoscedasticity, and equal kurtosis, do not vitiate our findings
since a variation of forty points in the correlation coefficient is required to create
around a ten per cent error in formulas (5) and (6).
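
The use of formulas (5) through (8) may be sketched as follows. This is a minimal illustration, assuming the formulas in the forms $\sigma^2 = [4 + K_1^2 + K_2^2 - 2r_{12}(K_1K_2r_{12} + 2)]/2N$ for the same population and $\sigma^2 = (2 + K_1^2)/2N_1 + (2 + K_2^2)/2N_2$ for separate populations; the function names and the test results used are hypothetical.

```python
import math

PROBABLE_ERROR_FACTOR = 0.6745  # converts a standard error to a probable error


def integration_index(mean, sd):
    """Integration index K = M / sigma."""
    return mean / sd


def se_difference_same_population(k1, k2, r12, n):
    """Standard error of K1 - K2 for one population, as in formula (5)."""
    inner = 4.0 + k1**2 + k2**2 - 2.0 * r12 * (k1 * k2 * r12 + 2.0)
    return math.sqrt(inner / (2.0 * n))


def se_difference_separate_populations(k1, k2, n1, n2):
    """Standard error of K1 - K2 for uncorrelated groups, as in formula (7)."""
    return math.sqrt((2.0 + k1**2) / (2.0 * n1) + (2.0 + k2**2) / (2.0 * n2))


# Hypothetical means and standard deviations on two tests.
k1 = integration_index(60.0, 12.0)
k2 = integration_index(54.0, 12.0)
se_corr = se_difference_same_population(k1, k2, r12=0.8, n=200)
se_uncorr = se_difference_separate_populations(k1, k2, n1=200, n2=200)
print(k1 - k2, se_corr, PROBABLE_ERROR_FACTOR * se_corr)
print(k1 - k2, se_uncorr, PROBABLE_ERROR_FACTOR * se_uncorr)
```

Setting the correlation to zero with equal populations makes the first function agree with the second, which is the reduction described in the text.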


PARENTAL AGE AND INTELLIGENCE OF OFFSPRING


MINNIE LOUISE STECKEL 
University of Chicago 


The object of this study was to disclose, if possible, any relationship 
which might exist between the intelligence of children and parental 
ages at the time of the birth of the child on the basis of findings on a 
large population by the use of intelligence tests. 

The data upon which this study is based were obtained from 
public school children at Sioux City, Iowa and their parents during 
the school years 1926-1928. The study includes records of children 
from Grade I to Grade XII inclusive. Intelligence ratings of the 
children were obtained by means of group intelligence tests. Ques- 
tionnaires were sent to the parents asking for their own ages, and 
the ages and birth order of all their children. 

Four different group tests were used in order adequately to cover 
the age range of intelligence. The Kuhlmann-Anderson Test was 
used for grades I to Junior III; the National Intelligence Test for 
grades Senior III to Senior IV; the Otis Intermediate Test for grades 
Junior V to Senior VIII, and the Otis Advanced Test for grades 
Junior IX to Senior XII. Several thousand children were tested with 
each test. It was possible, therefore, to restandardize on the same 
basis the intelligence quotients as obtained by each of the four tests, 
so that the results of all four tests are directly comparable. The 
standard score is calculated in the following manner: 


\[ S = \frac{IQ - M}{\sigma} \]

in which $S$ is the standard score, $IQ$ is the child’s intelligence
quotient, $M$ is the mean intelligence quotient for the year group of
the child’s age and $\sigma$ is the standard
deviation of the intelligence quotients for that age group. By using
the intelligence quotient upon which to base the calculations rather 
than the raw score, the measure is uncorrelated with age. By trans- 
muting the intelligence quotients of each test into standard scores the 
possibility of differences which might arise due to the relative difficulty 
of the tests at the various age levels is eliminated. The constant 5.00 
has been added to the standard scores as calculated in order to avoid 
negative values. Thus a child whose rating is 5.00 has an intelligence 
rating equal to the average of all children of Sioux City of his age. A 
child with a score of 3.80 has a standard score of −1.20 for children of
his own age group. Only records of normal children of the Caucasian
race are included in this study.
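
The calculation of the rating described above may be sketched as follows. The function name, the age-group mean of 100 and the standard deviation of 15 are hypothetical values used only for illustration; the study does not report them here.

```python
def standard_score(iq, age_group_mean, age_group_sd, constant=5.00):
    """Standard score of an intelligence quotient within the child's age
    group, with the constant 5.00 added to avoid negative values."""
    return (iq - age_group_mean) / age_group_sd + constant


# Hypothetical age group: mean intelligence quotient 100, standard deviation 15.
average_child = standard_score(100.0, 100.0, 15.0)  # child at the age-group average
lower_child = standard_score(82.0, 100.0, 15.0)     # child 1.20 standard deviations below
print(average_child, lower_child)
```

A child at the mean of his age group receives the rating 5.00, and a child 1.20 standard deviations below it receives approximately 3.80, as in the example in the text.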

Some parents may have stated their own ages incorrectly. There
was, however, nothing compulsory about answering the questionnaire;
therefore these cases would be so few comparatively that they would
scarcely affect the validity of the study. In Table I columns C and
D show the distribution and mean intelligence of the children for
each two-year age period of the parents. Figure 1 presents graphically
the mean intelligence of children born during
each two-year age period of the mothers. Curve C shows this
relationship for the first-born children, and Curve D for all children
regardless of birth-order. Both curves indicate that children born
of mothers who are less than approximately twenty-six to twenty-eight
years of age are, in general, less intelligent than children born of
mothers who are this age or older.

Curves A and B of Fig. 2 show a similar relationship between
intelligence of children and paternal ages. The chief difference is
in the fact that the curves representing intelligence of children and
paternal ages indicate that children born of fathers under approximately
thirty or thirty-two years of age are in general less intelligent
than children born at this paternal age or older, whereas (as has been
said) this period with respect to maternal ages extends approximately
only to the twenty-sixth to twenty-eighth year.

Although the curves indicate periods of very high intelligence
after twenty-eight and thirty-two years for the maternal and paternal
ages respectively, these cases are at the extreme of the curve and are
represented by a relatively small number of cases. In general, after
these age periods, there is a slight drop in the curves representing
children’s intelligence. [...] and parental ages are consistently higher
than curves D and B representing the intelligence of all children
regardless of birth-order.

These facts as presented might have several possible interpretations.
It may be conjectured that whatever causes the vitality of
the human body to increase as maturity is approached and to diminish
with advancing age may also affect the uniting reproductive cells and
have a deleterious effect upon the mentality of the offspring. Another
possible interpretation may be that the extraordinary somatic difficulties
accompanying child birth in immature and in elderly mothers
might produce so great initial handicaps upon their children that
they may never be fully compensated by the time the children
reach maturity.

It might be that the differences in environmental factors for chil- 
dren of very young or elderly parents as compared to those parents 
of the intervening age period are pronounced enough to affect to an 
appreciable degree the rating of children on an intelligence test given 
at school.

Possibly the explanation of the results of this study is to be found 
largely on a nationality and socio-economic basis. Grouping parents 
according to age necessarily involves nationality, and socio-economic 
groupings also. Although the data are limited to the Caucasian 
race, either one or both of approximately one-fourth of the parents of 
the children are foreign-born. These are largely of Russian, Lithuan- 
ian, and German extraction. There is a tendency for European stock 
to marry young. It also is probable that children of immigrants, 
on the whole, would rate a little lower on an intelligence test than would 
children of native-born parents. 

The army examinations showed that the various occupational 
groups draw men of different intelligence levels. These occupational 
groups, therefore, constitute social classes based on intelligence. 
Studies by Hirsh,¹ Kornhauser,² and Haggerty and Nash,³ show
that the occupation of the father also is a fair index of the intelligence 
of his children, i.e., that there is a very real association between 
parental occupation and intelligence of offspring. 

Children of very young parents are children of parents whose 
occupational choice does not demand college or professional training. 
Many of these parents do not even go to high school. They drop 
out of school as soon as attendance laws permit or as soon as labor 
laws permit them to work. Not that going to college on the part of 
the parents appreciably increases the intelligence of their offspring 
but the parents in occupations which do not demand a long period 
of training are of a stratum of society which lives on a lower economic 
basis. The more intelligent children are offspring of parents who
have gone to college and professional schools. These parents of
necessity marry later. Their children are not more intelligent because
the parents are older and possibly not because their parents have
had more education but rather because their parents come from a
higher, more intellectual stratum of society. The scale of living of
this stratum of society demands that marriage and reproduction be
delayed until the parents are economically able to maintain their
family on a scale of living to which they themselves have been
accustomed.

¹ A Study of Natio-racial Mental Differences. Genetic Psychology Monograph,
1926, pp. 239-407.

² The Economic Standing of Parents and the Intelligence of Their Children.
Journal Educational Psychology, Vol. IX, 1918.

³ Mental Capacity of Children and Parental Occupation. Journal Educational
Psychology, 1924, pp. 559-572.

The fact that the intelligence of the child correlates
as closely with the father’s age as with the mother’s age (except
that the period of greatest intelligence for the child comes three or
four years later for the father’s age) might also be explained on a
socio-economic basis rather than on a hereditary or biological basis
such as the later maturing of the male parent.
The demand made on the father in support of his family delays his 
marriage and his reproductive period several years later than the 
mother’s in order that he may first establish himself economically 
and socially. That there is the same relationship between intelligence 
of children and paternal ages as between intelligence of children 
and maternal ages might be explained in the fact that the paternal 
ages correlate positively with the maternal ages and therefore bear 
a similar relationship to the intelligence of their offspring. 

The seemingly greater intelligence of first-born children, as indi- 
cated by the higher curves A and C, undoubtedly has its explanation 
on a socio-economic basis. It is generally understood that the lower 
economic classes have larger families than do the professional classes. 
The children coming from the lower occupational groups also have 
lower intelligence ratings. As long as we consider only the intelli- 
gence of the first-born children, each family is represented only once. 
The superior intelligence of the children of the professional classes 
raises the intelligence level of the entire group. In curves B and D, 
representing the intelligence of all children against parental age, the 
relatively lower intelligence of the greater number of children from 
families of lower socio-economic groups lowers the mean intelligence 
of the entire group. The greater intelligence of children represented 
in curves A and C is to be explained, then, not in the fact that they 
are first-born children, but in the fact that each family of the lower 
occupational classes has no more representatives in the group than 
each family of the professional classes.

If late marriage and reproduction accompany a rising economic
status, then the children of elderly parents should exceed the children 
of middle-aged parents in intelligence. A slight drop of the intelli- 
gence curve of children of elderly parents might conceivably be due 
to biological factors such as poor productive stock with a low fecundity- 
fertility ratio or to the physical decline of the older parent. The 
seemingly lower intelligence of children of this group, as indicated by 
the results of this study, might also find its interpretation on a socio- 
economic basis; for not only do parents of the lower economic classes 
reproduce at an earlier age than do the professional classes but there 
is a tendency for child-bearing to be continued throughout the entire 
reproductive period of the mother of the laboring classes while for 
the professional classes child bearing is restricted toward the close of 
the reproductive period as well as at its beginning. A slight decrease 
in intelligence of children of elderly parents, as indicated in the study, 
might be accounted for in the fact that this part of the curve again 
represents a preponderance of children from the lower occupational 
groups. Undoubtedly this drop in the curve of intelligence would 
be more pronounced were it not for the fact that within the same 
family intelligence increases with ordinal number as shown in an earlier
study.¹

The relationship between parental ages and intelligence of their 
children as indicated by the results of this study is doubtless a true 
relationship. The conclusion that there is a direct causal relation- 
ship between parental ages and the intelligence of their children is 
not at all justified by the results of the present study. The writer 
makes no claim to having established the cause or causes underlying 
the results of the study. 

If the population in question were first classified into occupational 
groups and the age of parents and intelligence of children compared, 
any differentiation might indicate biological factors operating which 
are dependent upon parental age. Present indications are that in 
such occupational groups the variation in intelligence of the children 
would be no greater than the occupational variation of the parents 
within each group. 

¹ Steckel, Minnie Louise: Intelligence and Birth Order in Family. Journal
of Social Psychology, August, 1930.

A subject closely associated with parental age and intelligence
of offspring is amount of disparity between the ages of the two parents
as compared to the intelligence of their offspring. Custom and general
opinion presumably would favor the father being from two to five
years older than the mother. In Table I, columns E and F give the
distribution and mean intelligence of the children for each two years
disparity between the ages of the two parents. This relationship is
shown graphically in Fig. 3. Curve E represents the mean intelligence
of first-born children and curve F represents the intelligence of all
children against the disparity between parental ages. The disparity
between the ages of the two parents is indicated by the “Father’s age
minus the Mother’s age.” When the father is the older parent the
difference between the ages is indicated as +, when the mother is
older the difference is indicated as −; and 0, of course, indicates that
both parents are of the same age. Considering both curves E and
F, the mean intelligence of the children decreases as the disparity
between the parental ages increases. The curves both show a more
rapid decrease in the intelligence of the children when the mother
is the older parent than when the father is older. At the extremes,
in both + and − directions, the cases are so few that the curve is
very irregular. However, the general tendency downward, indicating
lower intelligence, is quite apparent. The curve of intelligence of
the first-born children against disparity of parental age is higher
than when the intelligence of all children is represented. Here again,
as with parental age and intelligence, the larger families, having more
representatives and lower intelligence ratings, lower the mean intelligence
of the entire group. When the mean intelligence of only the
first-born children is considered, the families of the professional
classes have equal representation with the lower occupational groups,
so the mean intelligence of the group is higher.

TABLE I.—PARENTAL AGE AND INTELLIGENCE OF CHILDREN
[Columns A (first-born children) and B (all children) are arranged by age of father, columns C (first-born children) and D (all children) by age of mother, and columns E (first-born children) and F (all children) by disparity between parental ages; each gives the total number of children and their mean scores. The tabulated values are not legible in this copy.]

[FIG. 3. Mean intelligence of children against disparity between parental ages (father’s age minus mother’s age). Curve E: first-born children; curve F: all children.]

The writer knows of no study indicating whether greater disparity
between parental ages is more common between foreign-born parents
or between native-born parents; between parents of the professional
classes or between parents of the lower socio-economic groups; or in
which groups the mother more often is the older. Such evidence and
an occupational classification are necessary before the causes which
are operative to produce the relationship between intelligence of
child and amount of disparity between parental ages can be determined.

The results of this part of the study indicate that the greater the 
disparity between parental ages the less favorable is the prognosis 
for the intelligence of the offspring. The prognosis, however, is more 
favorable if the father is the older parent than if the mother is the older 
parent if the amount of disparity between the ages of the two parents 
is greater than four or five years. 


CONCLUSIONS 


In general, as shown by intelligence tests, children born of very 
young parents are less intelligent than children born of more mature 
parents. Below the age of twenty-six to twenty-eight for mothers 
and thirty to thirty-two years for fathers, the younger the parents 
the less favorable is the prognosis for the intelligence of the offspring. 

The present study indicates that the nearer the ages of the two 
parents approach each other the more favorable is the prognosis for 
the intelligence of their children. The prognosis for the intelligence 
of the children is less favorable as the disparity between parental 
ages grows extreme. When extreme disparity exists between parental 
ages the prognosis for the intelligence of the child is better if the father 
is the older parent than if the mother is the older parent.

The writer presents several possible interpretations of the results
of the study but makes no claim to having established the factors 
which might be operative in producing the results as presented. Fur- 
ther study must be made to reveal which interpretation most nearly 
approaches the truth. 
SEX DIFFERENCES: COLLECTING INTERESTS

In a previous article the writers have questioned the instinctiveness
of the collecting tendency. They have mentioned also some of the
difficulties that one encounters when he attempts to define the term
“collecting.” Definition of “collecting” appears to be varied and
inconsistent; it is nevertheless true that studies of the articles which 
children report that they actively collect may reveal certain intrinsic 
interests of children. Therefore, it seems logical that such interests 
might be used profitably in motivating school work. Collecting inter- 
ests appear to be associated intimately with the growing self; the recog- 
nized identity of the interest and the growing self (with consequent 
inner urge) should provide a genuine actuator of interest and conse- 
quent success in school work. Of course, numerous interests are 
undesirable; these should be recognized, sublimated, and redirected. 
Others, however, seem salutary manifestations of growth. These 
should have abundant opportunity for expression in the curricular 
activities. Genetic studies of behavior manifestations seem to the 
writers to be of inestimable value in guiding and developing school 
children. 

In a previous article,' the writers presented a list of approximately 
200 articles which children listed as ones they were actively collecting 
in 1927-1928. Further analysis of these data has yielded certain indi- 
cations of sex differences in collecting interests which appear to be 
significant. 

As a means of studying the sex differences in collecting interests, 
the writers listed the articles* that were collected more than twice as 
commonly by boys as by girls; they listed also the articles that were 
collected more than twice as frequently by girls as by boys. By this 





* These data were assembled by one of the writers and several graduate stu- 
dents. The table containing the articles most frequently collected, and the method 
of investigation may be obtained by reference to this Journal, Vol. XXI, 1930,
pp. 112-128. 
means two lists were obtained. The first list therefore includes only
those collections which seem to appeal predominantly to girls; the second
list includes only those collections which appeal strongly to boys.
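
The rule of selection just described may be sketched as follows. This is an illustration only: the function name, the `ratio` argument, and the report counts are hypothetical and are not taken from the writers' data. The sketch also assumes that raw counts may be compared directly; where the numbers of boys and girls questioned differ, proportions would be the safer basis.

```python
def predominant_sex(boys_count, girls_count, ratio=2.0):
    """Say on which list, if either, an article would be placed.

    An article goes on the boys' list when boys report it more than
    `ratio` times as often as girls, on the girls' list in the reverse
    case, and on neither list otherwise.
    """
    if boys_count > ratio * girls_count:
        return "boys"
    if girls_count > ratio * boys_count:
        return "girls"
    return "neither"


# Hypothetical report counts (boys, girls), invented for illustration only.
reports = {
    "dolls": (1, 40),
    "marbles": (55, 20),
    "stamps": (30, 25),
}

lists = {name: predominant_sex(b, g) for name, (b, g) in reports.items()}
```

An article reported exactly twice as often by one sex falls on neither list, since the rule asks for more than twice the frequency.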

The articles which are collected much more frequently by girls 
than by boys appear in Table I; those which have a much more intense 
appeal to boys than to girls are listed in Table II. 

In Table I, the articles collected by the girls are classified under 
seven headings, namely: (a) Objects possessing esthetic appeal or 
value, (b) objects for personal adornment, (c) objects of sentimental 
appeal, (d) dolls, doll clothes, etc., (e) household accoutrements, (f) 
souvenirs (predominantly from the classroom), and (g) objects used 
in playing games. In Table II, the articles of unusual interest to the 
boys are assembled under six headings, namely: (a) Animal parts and 
insects, (b) junk (to sell), (c) tobacco souvenirs, (d) objects associated 
with war, hunting, fishing, etc., (e) objects used in playing games, and 
(f) miscellaneous ones. 

In examining Tables I and II certain facts should be kept in mind. 
In the first place one should recall that this presentation seeks to place 
in sharp relief sex differences, not sex likenesses. If this fact be over- 
looked, the reader might exaggerate the significance of the sex 
differences. For example, in Table I (a), the girls are shown to have 
collected more often than the boys objects of esthetic appeal or value. 
Nevertheless, one or more boys collected all but one of the items listed 
in part (a) of Table I. The items listed in Table I (a) are not articles 
collected only by girls; they are items which were collected more than 
twice as frequently by girls as by boys.* 

One must bear in mind also that these data were obtained within a 
rather restricted geographical area. A specific environmental back- 
ground is therefore reflected. For example, in Table II (a) it will be 
found that the boys collected furs, rabbits’ ears, gopher skins, etc. It 
is clear that city boys’ collections would yield no such collections. For 
this reason the sex differences herein reported should be interpreted 
with the realization that they obtain for a specific locality only. 
Nevertheless, when these data are viewed from the standpoint of the 
several groupings rather than from the standpoint of specific items, it 
seems probable that the sex differences may be representative of 
widespread and rather general tendencies. 





* The writers have previously discussed the fact of sex differences in esthetic 
appreciation. (See American Journal of Psychology, Vol. XL, July, 1928, pp. 
449-457.) The present findings corroborate the conclusions set forth previously. 
Table I (b) presents the names of objects which are used by girls for
self-adornment. Although most of these articles were collected to a 
limited degree by boys as well as by girls, they were collected with 
strikingly greater frequency by girls. Furthermore, only one similar 
object (necktie) appears in the boys’ list. Noticeable indeed is the 
fact that Table I (b) contains ten items. In the conspicuous interest 
of the girls in this type of collecting and the paucity of interest among 
boys there is evidence of one clearly defined sex difference in behavior 
trend. 

It is readily apparent that this attempt at classification is arbitrary
and therefore somewhat inaccurate. For example, in Table I (c) are 
listed the names of objects which possess sentimental value. It is 
obvious that almost any cherished possession may have a sentimental 
value. It is equally true that practically any of the items listed in 
Table I may have been a gift or a token, and therefore may have 
acquired sentimental value. Nevertheless, if one inspects carefully the 
two lists, he will grant undoubtedly that the girls’ list contains many 
more objects than the boys’ which generally arouse affective reaction 
and sentiment.*

Table I (d) sets forth certain items that are collected almost
exclusively by girls, namely, dolls, doll clothing, etc. This sex difference
would scarcely be considered phenomenal. This list, however,
considered with list I (c), brings to light conspicuous and apparently
general interests of girls.

From Table I (f), it will be noted that the girls collect souvenirs of 
the schoolroom much more commonly than do boys. No items of a 
similar nature appear in the boys’ list. (See Table II.) 

In the several lists presented in Table I (a) to (f), striking behavior 
patterns of girls appear. Girls are attracted by objects which may 
be used for personal adornment; they assemble and appear to cherish 
much more than boys objects which call forth affection and sentiment; 
and they collect and keep relics and portable objects which have been 
associated with school life. The latter tendency may be additional 


* It is of interest that the only items of the entire one hundred ninety that suggest 
superstitious belief, namely, charm strings, and four-leaf clovers, were reported 
more often by the girls than by the boys. Sex differences in this regard have been 
reported by the writers in a previous article. (See Journal of Abnormal and 
Social Psychology, Vol. XXIII, 1928, pp. 356-368.) 
evidence of the girls’ rather intense liking for school. This liking is
less frequently found in the normal boy.* 

The objects collected by the boys suggest their outdoor activity
and rather vigorous play life. This is particularly striking in the
first five classifications of Table II. Perusal of this table shows that
very few of the objects may be described as ones which possess esthetic
or sentimental appeal. And the lists contain few objects which
could be used for personal adornment. Few of the girls collected
objects which, by any breadth of imagination, could be made to fit 
into the first four classifications of Table II. The things collected by 
the girls reflect the relatively narrow geographical radius which charac- 
terizes their lives. The boys’ lists, however, reveal their relatively 
great participation in activities which require a wide geographical 
sphere and unrestricted outdoor activity. 

The rather striking sex differences found in Tables I and II seem 
to the writers to explain partially a finding reported by Miss Burk 
more than thirty years ago. Miss Burk reported: 


The boys exceed the girls somewhat in finding and hunting, and considerably 
in trading and buying. The girls exceed the boys very greatly as passive recipients 
of outside assistance, in having their things given to them by brothers, sisters, 
parents, uncles, aunts and friends. But this excess of passivity on their part is 
not balanced by any special decrease in the method of finding, but it rather bal- 
ances the excess of trading, buying, and winning among the boys? (p. 194). 


The preceding comment is corroborated by Miss Whitley. 


The percentage of girls depending on gifts to add to their collections is higher 
at all ages than it is for boys, and consistently lower for trading? (p. 259). 


The items listed in Tables I and II are not easily classifiable 
according to the ‘‘reasons” which prompted the children to collect 
them. Nevertheless, careful examination of the lists enables the 
reader to identify certain of the reasons for the sex difference in 
methods employed in making collections. In the girls’ list the objects 
which possess sentimental value (See Table I (c)) are ones which 
possess also little or no practical utility. This statement is true 
regarding most of the items in the other lists, I (a) to (f). Many of 
the items possessing esthetic and sentimental appeal are so perish- 
able that they have little or no value for trading or selling. Girls 





* This also is indicative of a sex difference in attitude that the writers have 
previously discussed. (See Education, Vol. XLIX, 1929, pp. 449-458.) 
appear to obtain many of their collections as gifts, and boys actively
assemble collections in order that sale or trade may ensue.

Of course it is true that many objects collected by the boys are of 
little actual value or utility. But there is this striking difference 
between the girls’ and the boys’ lists. All ten of the items listed in 
Table II (b) have a market value* and many of them have little 
value other than this. The girls’ objects of personal adornment 
probably have some market value, but this value is no doubt usually 
outweighed by the affective appeal of the articles. 

There is one other aspect of this point which is worthy of mention. 
It will be noted that the writers found it necessary to include a
“miscellaneous” grouping for the items collected chiefly by boys. The
girls’ collections were more easily classified than were the boys’.
This was due doubtless in part to the greater variety of objects col- 
lected by the boys. The great variability of this type of behavior 
among boys is due in part to their relatively great freedom to take 
part in outdoor life (as compared with girls) and their relatively 
vigorous, unrestricted play life. 

Because they have pockets in which they can carry about more 
easily than girls certain of the miscellaneous objects collected by them, 
the boys are clearly in a better position than girls to satisfy whatever 
desire they may have to trade objects which they have collected. 
This desire is also more easily met by the boys because of the fact that 
in their spontaneous play life they traverse a wider geographical 
area than do girls. This fact has been previously commented upon 
by the writers‘ (p. 93). 

Although the above attempts to explain the sex differences in 
“method of and reason for collecting” are of interest, it is the judg- 
ment of the writers that such theorizing has less value than knowledge 
of the actually observed tendencies of children. If one accept the 
view that education is a matter of experience; that the curriculum 
itself is a series of experiences, it becomes at once evident that the 
boys and girls herein studied are not receiving the same sort of educa- 
tion. The pupils have themselves assisted in making their curricular 
differentiation, perhaps on the basis of genuine and deep interests. 
In any event, before the educator will be in a position to direct most 
effectively the child’s education, he should acquaint himself with the
child’s actual experiences. 





* This is to be interpreted in terms of the juvenile nature of boys’ trading and
selling.
ERRORS, DIFFICULTY, RESOURCEFULNESS, AND
SPEED IN THE LEARNING OF BRIGHT AND
DULL CHILDREN


FRANK T. WILSON 
State Teachers College, Buffalo 


In a previous study¹ the learning of bright and dull children was
compared in a task of learning how to win a game in which two players
alternately draw either one or two from a given number of pieces
with the object of winning the last piece. Success requires the application
of the principle that one’s opponent must be forced to draw
from a multiple of three at each of his plays. It appeared in that
study that for groups of children nine and twelve years of age, half
of each group at a level of 70 to 80 IQ and half at 110 to 120, the
average difference of 30 IQ points gave advantage, as measured by
amount of work, to the brighter children, and that the average difference
of three years favored the older groups. The difference in IQ
seemed to be a little more important than that in chronological age 
since, as measured in the study, the bright nine eventually somewhat 
surpassed the accomplishment of the dull twelve who averaged 
practically the same mental age. 
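
The principle of the game can be checked mechanically. The sketch below is an illustration in Python and forms no part of the original study; the function names are hypothetical. It states the rule (draw so as to leave the opponent a multiple of three) and verifies it by exhaustive search over small numbers of pieces.

```python
from functools import lru_cache


def winning_draw(pieces):
    """Return the draw (1 or 2) that leaves a multiple of three.

    Returns None when the pile is already a multiple of three, in which
    case no draw can force a win against correct play.
    """
    if pieces <= 0:
        raise ValueError("pieces must be positive")
    remainder = pieces % 3
    return remainder if remainder != 0 else None


@lru_cache(maxsize=None)
def mover_can_win(pieces):
    """Exhaustive check: can the player about to draw take the last piece?"""
    if pieces == 0:
        # The previous player took the last piece, so the mover has lost.
        return False
    # The mover wins if some legal draw leaves the opponent unable to win.
    return any(not mover_can_win(pieces - d) for d in (1, 2) if d <= pieces)


# The rule and the exhaustive search agree for every pile size tried.
agreement = all(
    mover_can_win(n) == (winning_draw(n) is not None) for n in range(1, 40)
)
```

With four pieces the rule gives a draw of one, with five a draw of two; with six no winning draw exists.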


I. EXPLANATION OF THE STUDY


This report gives the result of further study of the groups in this
task in regard to the “Lost moves,” or errors made. Data for the
groups are given in Table I, six additional cases having been included
in the later study.

TABLE I.—NUMBER OF CASES

                 Nine years        Twelve years
              Dull    Bright     Dull    Bright     Total
Boys            8        8        11        7         34
Girls           7       10         7        8         32
Total          15       18        18       15         66














1 Wilson, Frank Thompson: ‘‘Learning of Bright and Dull Children.” Bureau 
of Publications, Teachers College, N. Y. 
The following explanation will make the references to the procedure
of the task intelligible to the reader. Complete explanations will
be found on pages 12-15 of the first study.

“Paper clips were used for the pieces. Each subject had thirty 
trials each day. A trial was considered to be the series of ‘draws 
involved in the reduction of any initial number (of clips) to 0, regard- 
less of whether S (the subject) wins or loses.’ Peterson, J. C.: The 
Higher Mental Processes in Learning. Psychological Monographs, Vol. 
XXVIII, No. 7, 1920, p. 3.” 

The learning was carried on for five successive days. The game 
was explained and illustrated to the subjects with three clips. The 
task was begun with four clips and each subject, after having won 
once with four clips, was told that he must win three times in succes- 
sion, the rule which constituted the criterion of successful learning 
of each step. The game continued with the four clips. Having won 
three times in succession with four the task continued in the same 
manner with 5, 7, 8, 10, 11, 13, 14, 16, and 17 clips, or as far as the 
subject worked in the one hundred fifty trials of the series. The 
experimenter was the opponent of the subject, and his draws were 
always so made that the subject would win if drawing correctly but 
lose if drawing incorrectly. That is to say, the subject could always 
win if he applied the principle. As far as practicable all other factors 
in the experiment were kept constant so that the records show in the 
main, it is believed, differences due to the differences in the subjects 
of the four groups. 
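
The criterion of three successive wins and the order of the steps may be sketched as follows. The sketch is an illustration under stated assumptions, not the experimenter's record-keeping: the function name is hypothetical, every step is treated alike (the single preliminary win with four clips is not modeled), and a run of wins is taken to start afresh at each new step.

```python
# Numbers of clips used at the successive steps, as listed in the text.
STEPS = [4, 5, 7, 8, 10, 11, 13, 14, 16, 17]


def steps_solved(outcomes, wins_needed=3):
    """Return the steps solved, given one True/False outcome per trial.

    A step counts as solved when the subject wins `wins_needed` trials
    in succession; the next trial is then played with the next number
    of clips.
    """
    solved = []
    streak = 0
    for won in outcomes:
        if len(solved) == len(STEPS):
            break  # every step has been solved
        streak = streak + 1 if won else 0
        if streak == wins_needed:
            solved.append(STEPS[len(solved)])
            streak = 0
    return solved


# A subject who never loses needs thirty trials to pass all ten steps.
perfect = steps_solved([True] * 150)
# One loss in the middle of a run sends the count back to zero.
interrupted = steps_solved([True, True, False, True, True, True])
```

None of the step sizes is a multiple of three, which agrees with the statement that the subject could always win by applying the principle.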


II. TOTAL LOST MOVES


TABLE II.—AVERAGE NUMBER OF LOSING MOVES

                 Nine years        Twelve years
              Dull    Bright     Dull    Bright
Boys      [illegible]   81        82       59
Girls          87       84        79       79
All          89.3     82.7      80.9     69.7

















Table II gives the average number of losing moves for each group 
and for boys and girls. There seems little question but that the bright 


1 Ibid.: P. 12.
twelve year old group has the best record and the dull nine the poorest.
The slight numerical difference between dull twelve and bright nine 
gives no grounds for deducing a significant real difference. 
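
The group averages of Table II can be checked against the averages for boys and girls by weighting each by the number of cases in Table I. The sketch below is an illustration only. It assumes that the first row of each table refers to boys and the second to girls, and that the boys number 8, 8, 11, and 7 in the four groups; the names are hypothetical. The average for the dull nine year old boys, which is not clearly printed, is inferred from the group average, so that the figure of about 91 is an inference subject to rounding and is not a value taken from the table.

```python
# Number of boys and girls in each group, as read from Table I.
counts = {
    "dull 9": (8, 7),
    "bright 9": (8, 10),
    "dull 12": (11, 7),
    "bright 12": (7, 8),
}
# Average losing moves from Table II; the boys' entry for dull 9 is unclear.
boys = {"bright 9": 81, "dull 12": 82, "bright 12": 59}
girls = {"dull 9": 87, "bright 9": 84, "dull 12": 79, "bright 12": 79}
combined = {"dull 9": 89.3, "bright 9": 82.7, "dull 12": 80.9, "bright 12": 69.7}


def pooled_mean(n_boys, mean_boys, n_girls, mean_girls):
    """Average for a whole group from the averages of its two sexes."""
    return (n_boys * mean_boys + n_girls * mean_girls) / (n_boys + n_girls)


def implied_boys_mean(group):
    """Boys' average implied by the group average and the girls' average."""
    n_boys, n_girls = counts[group]
    total_moves = combined[group] * (n_boys + n_girls)
    return (total_moves - n_girls * girls[group]) / n_boys


# Recompute the three group averages for which both sexes are legible.
recomputed = {
    g: pooled_mean(counts[g][0], boys[g], counts[g][1], girls[g]) for g in boys
}
# Infer the unclear entry for the dull nine year old boys.
dull9_boys = implied_boys_mean("dull 9")
```

The recomputed averages agree with the printed ones to within a tenth of a move, which supports the reading of the rows as boys and girls.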

Differences between groups by sexes as shown in these data are 
uncertain because of the small number of cases for the sexes, but the 
comparisons are given as they indicate the desirability of further 
investigation. The most striking comparison is that of the bright 
and dull twelve year old girls, who made practically the same showing. 
The dull nine year old girls did nearly as well as the bright nine year 
old girls. In the case of the boys the differences are decided between 
the bright and dull groups at each age, while the two groups of the 
same mental age score the same. Comparing sexes it seems that for 
the bright twelve group the boys did very much better than the girls. 
In the other three groups the differences are probably so small as to 
be statistically valueless. 

Conclusion 1.—From the straight count of losing moves made it 
seems that success in the task is more probable for the bright older 
subjects and that mental age is the most significant factor. 


III. SAME LOST MOVES


There was made next an investigation of successive same losing 
moves. Does the presence of differences in IQ and real age produce 
records for successive same losing moves which compare groups 
otherwise than do the records of total errors? Table III gives the 
data for the groups by sexes. 


TABLE IIIA.—AVERAGE NUMBER OF TIMES SAME LOSING MOVES WERE MADE
TWO OR MORE TIMES IN SUCCESSION

                 Nine years        Twelve years
              Dull    Bright     Dull    Bright
Boys          14.8      7.5      10.3      6.3
Girls         12.9     10.7       9.9     10.8
All           13.9      9.3      10.1      8.7








The figures should be read as follows: The average dull nine year
old boy made the same losing move twice or oftener in succession 14.8
times in his one hundred fifty draws; the average bright nine year
old boy 7.5 times; etc.

Something new seems to be present in this table. The bright nine 
year old boys score better than the dull twelve year old and the differ- 
ence is large. In fact the bright nine are nearer the bright twelve 
than they are to the dull twelve. The girls, again, however, make
records nearly the same, except the dull nine, who are much poorer
than the other girls. Their superiority over dull nine boys is not
sufficiently certain to do more than note.

The significance of making the same losing move successively 
is not wholly clear. Questioning of subjects after experimentation 
brought out the information that sometimes the same moves were 
made in order to study them more carefully to discover other possibili- 
ties. Such a process may have been more frequent with the bright. 
Observation and introspection suggest that sometimes the same moves 
were made because the previous trial was not remembered correctly 
and the subject thought he was playing differently the second time. 
Such processes may have been more common with the dull. Table 
IIIB, although covering very few cases, emphasizes the weakness of 
the dull nine in repeating the same losing moves many times. 


TABLE IIIB.—AVERAGE NUMBER OF TIMES SAME LOSING MOVES WERE MADE
IN 3 TO 9 SUCCESSIVE TRIALS

                       Dull 9   Bright 9   Dull 12   Bright 12
Same moves:
  [label illegible]     1.5       1.4        .9        1.0
  [label illegible]     1.1        .1        .3         .3
  [label illegible]      .3        0         0          0
  [label illegible]      .1     [remaining entries illegible]
  [label illegible]     [entries illegible]

















A complication in interpreting all the data of this study exists in 
the nature of the differences in the difficulty of the successive steps of 
the task. This is affected by the practice which goes on with learning 
each step as well as by the number of clips used in the various steps. 
The subjects had varying amounts of practice on the several steps 
depending upon how quickly each won three times in succession. Some
subjects had much practice on the early steps because of failure to
win the three times. Others had little practice in the early steps
because they did win three times with only few losses. Yet the latter,
apparently doing better with the problem at first, often found later
steps just as difficult as those who had more trouble at first. For the
bright, however, it should be noted that, as they progressed farther 
than the dull of the same age, they had to work with steps using more 
clips and, accordingly, having more possibilities in the way of moves. 
With those larger numbers of clips it seems that subjects intentionally 
repeated same moves in order to study the situation more thoroughly. 

Conclusion 2.—Analysis of successive same lost moves seems to 
indicate that brightness acts to reduce such errors. 


IV. DIFFICULTY OF STEPS


The difficulty of the steps, which, as noted just above, is
uncertain, has been studied from the data regarding the number of
lost moves by steps and groups. Tables IVA, IVB, IVC, and IVD
show these data.


TABLE IVA.—DIFFICULTY OF STEPS BY GROUPS MEASURED BY THE AVERAGE
NUMBER OF ALL LOSING MOVES

Step           4     5     7     8    10    11    13    14    16    17
Dull 9        2.9   2.2  32.7  16.7  31.3    .5   1.5    .8    .7
Bright 9       .6   1.2   6.3  12.5  39.1  16.4   4.3   1.3    .8
Dull 12        .7   1.4   7.6  15.8  38.4  11.6   3.3    .9   1.3   .05
Bright 12      .4    .6   6.4   8.5  27.8  11.9   8.4   2.6   2.7    .4


























TABLE IVB.—DIFFICULTY OF STEPS BY GROUPS MEASURED BY THE AVERAGE
NUMBER OF SAME SUCCESSIVE LOSING MOVES

Step           4     5     7     8    10    11    13    14    16    17
Dull 9        .46   .33  5.80  2.40  4.33   .06   .26   .00   .20
Bright 9      .11   .06   .66  1.56  5.27  1.22   .22   .11   .06
Dull 12       .11   .11  1.00  2.66  4.39  1.26   .50   .05   .00
Bright 12     .07   .07  1.00  1.27  4.07  1.27   .73   .07   .13

Table IVD is an hypothetical record made up by converting the
number of actual lost moves recorded for each group into what those
figures hypothetically become if one hundred per cent of each group
had been allowed enough trials to solve each step. The arithmetical
sources for Table IVD are found in Tables IVA and IVC. There is


TABLE IVC.—PERCENTAGE OF EACH GROUP SOLVING THE VARIOUS STEPS OF
THE TASK

Step           5     7     8    10    11    13    14    16    17
Dull 9       100    80    67     7     7     7   [remaining entries illegible]
Bright 9     100   100   100    83    44    17     7
Dull 12      100   100   100    67    28   [remaining entries illegible]
Bright 12    100   100   100    80    73    40    27    20    13





























TABLE IVD.—DIFFICULTY OF STEPS BY GROUPS MEASURED BY THE AVERAGE
NUMBER OF ALL LOSING MOVES CORRECTED FOR ONE HUNDRED PER CENT
SOLVING EACH STEP

Step           4     5     7     8    10    11    13    14
Dull 9        2.9   2.2  40.9  25.0   447
Bright 9       .6   1.2   6.8  12.5  47.1  37.0  25.3
Dull 12      [row illegible]
Bright 12      .4    .6   6.4   8.5  34.8  16.3  21.0   9.6




















considerable probability that if all subjects had been permitted to 
work until each step had been solved as far as shown in Table IVD the 
actual records would be different as the practice obtained at one step 
would affect the work on the following steps. But even granting such 
probability Table IVD doubtless represents the facts in regard to 
comparative difficulty of steps more truly than either of the raw data 
in Tables IVA and IVB. If objection is made to this assumption, 
however, the raw figures may be referred to as the same relative dif- 
ferences and comparisons of groups are found in them, although not 
such striking ones. The figures for the larger numbered steps have 
little significance because few subjects reached them. It seems, then,
that attention is warranted chiefly to the first part of the series. 
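
The correction by which the figures of Table IVD appear to have been obtained may be sketched as follows. The rule of dividing the average number of lost moves by the proportion of the group solving the step is inferred from the description and from its agreement with the legible figures; the text does not state the formula, and the function name is hypothetical. The percentages are printed to the nearest whole per cent, so small discrepancies are to be expected.

```python
def corrected_lost_moves(average_lost, percent_solving):
    """Scale an average of lost moves up to one hundred per cent solving."""
    if not 0 < percent_solving <= 100:
        raise ValueError("percent_solving must lie between 0 and 100")
    return average_lost * 100.0 / percent_solving


# (group, step, average from Table IVA, per cent from Table IVC or the
#  text, figure printed in Table IVD); only legible entries are used.
examples = [
    ("dull 9", 7, 32.7, 80, 40.9),
    ("dull 9", 10, 31.3, 7, 447.0),
    ("bright 9", 10, 39.1, 83, 47.1),
    ("bright 9", 13, 4.3, 17, 25.3),
    ("bright 12", 10, 27.8, 80, 34.8),
    ("bright 12", 11, 11.9, 73, 16.3),
    ("bright 12", 13, 8.4, 40, 21.0),
    ("bright 12", 14, 2.6, 27, 9.6),
]

# Largest difference between a recomputed and a printed figure.
largest_gap = max(
    abs(corrected_lost_moves(avg, pct) - printed)
    for _, _, avg, pct, printed in examples
)
```

On these entries the recomputed and printed figures differ by less than two tenths of a move, the largest difference arising at step ten for the dull nine, where a rounded seven per cent is the divisor.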
The dull nine group found steps four and five much more difficult
than did the other groups. Step seven became a much more difficult 
problem for all groups than the preceding ones, but for the dull nine 
it appears to have been very much more difficult, only eighty per cent 
of the group ultimately solving it and the average number of lost 
moves at that step being 33, or corrected, 41. Step eight appears to 
have been easier for the dull nine, at least after the practice they had 
had on the previous steps. There is a slight drop in the percentage 
which solved this step, but even so the corrected number of lost moves 
shows a smaller figure for step eight than for step seven. For the other 
groups step eight seems to be harder than seven. The next step 
jumps the losing moves up to very much higher figures, and from the 
table of the percentage who solve step ten this one is Waterloo to the 
dull nine. Only seven per cent survive its demands. The corrected 
average number of lost moves thereby becomes 447. There seems to 
be no doubt but that the difficulty of step ten, even after the previous 
practice of the task, is very severe for all groups. The next step, eleven, 
is not quite so hard for the remaining three groups, although it elimin- 
ates more than another one-third of the dull twelve. For the three 
groups it is harder than any previous step, excepting step ten. Pos- 
sibly the succeeding steps are easier. The small amount of data for 
them does not warrant conclusions and none are therefore made. 

Conclusion 3.—Apparently all groups found step ten the hardest 
as it was met with in this series. Relatively it was very much the 
hardest for the dull nine. For all groups but dull nine, the preceding 
steps increased in difficulty and the succeeding steps decreased in 
difficulty. For dull nine step seven presented an extremely acute 
difficulty and step eight considerably less difficulty than either step 
seven or step ten. 


V. RESOURCEFULNESS 


Whether or not this term is the correct one to use, the data given 
in Table V show in a comparative way to what extent the different 
groups used successive same losing moves and other losing moves. The 
figures for each group give the per cent that the moves at each step 
were of the total number of the whole series of steps. 

The table should be read as follows: The dull nine made three per 
cent of its successive same losing moves at step four; two per cent at 
step five; etc. It would seem that, if the proportion of successive 
same losing moves was greater than that of the other losing moves at
the beginning of the task and if the proportion decreased toward the 
latter end, it would be reasonable to hold that subjects were aban- 
doning a method of little or no return for some other method of more 
promise. 

The figures in the table indicate a slight superiority of the bright 
groups over the dull, especially if the per cents are totaled for steps 
four to ten. Bright twelve, which made the best progress in solving 
the problem, has a much better record than dull twelve and the bright 
nine show a better score than dull nine. 


TABLE V.—PER CENT AT EACH STEP THAT SAME LOSING MOVES ARE OF TOTAL
SAME LOSING MOVES COMPARED WITH PER CENT AT EACH STEP THAT ALL
OTHER LOSING MOVES ARE OF TOTAL ALL OTHER LOSING MOVES

                                        Step                      Total,
                           4    5    7    8   10   11   13      steps 4-10
Dull 9:
  Same losing moves        3    2   43   18   32   ..   ..          98
  Other losing moves       3    3   37   20   37   ..   ..         100
Bright 9:
  Same losing moves        1    1   [remaining entries illegible]   83
  Other losing moves       1    2    8   15   47   21    6          73
Dull 12:
  Same losing moves        1    1   10   27   44   13    5          83
  Other losing moves       1    2   10   20   50   14    4          83
Bright 12:
  Same losing moves        1    1   12   15   47   15    8          76
  Other losing moves      [illegible]    9   12   40   18   13      63




















Conclusion 4.—In the kind of “resourcefulness” shown in Table
V brightness seems to be significant. 
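
The totals for steps four to ten, on which the comparison of groups rests, may be recomputed from the entries of Table V. The sketch below is an illustration only; the names are hypothetical, rows with unclear entries are omitted, and the reading of the row for the same losing moves of bright twelve (1, 1, 12, 15, 47) is an assumed reading of a poorly printed line.

```python
# Per cent of each kind of losing move made at steps 4, 5, 7, 8, and 10.
table_v = {
    ("dull 9", "same"): [3, 2, 43, 18, 32],
    ("dull 9", "other"): [3, 3, 37, 20, 37],
    ("bright 9", "other"): [1, 2, 8, 15, 47],
    ("dull 12", "same"): [1, 1, 10, 27, 44],
    ("dull 12", "other"): [1, 2, 10, 20, 50],
    ("bright 12", "same"): [1, 1, 12, 15, 47],
}
# Totals printed in the last column of the table for the same rows.
printed_totals = {
    ("dull 9", "same"): 98,
    ("dull 9", "other"): 100,
    ("bright 9", "other"): 73,
    ("dull 12", "same"): 83,
    ("dull 12", "other"): 83,
    ("bright 12", "same"): 76,
}

# Sum each row over steps four to ten and compare with the printed total.
totals = {row: sum(values) for row, values in table_v.items()}
rows_agree = totals == printed_totals
```

A smaller total means that a larger share of the losing moves fell at the later steps, which a group reaches only by solving the earlier ones.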
Table VIA gives the figures for the average number of total losing
moves by sexes and groups for each daily set of thirty trials. Table
VIB gives the figures for the average number of same losing moves by 
sexes and groups for each daily set of thirty trials. This arrangement 
of the data shows something as to the rate at which errors decreased 
or increased during the series of one hundred fifty trials. Using the 
bright twelve year old boys as suggestive of the most efficient learning 
it is found that in both tables that group increases its average number 
of errors through the second and third days and then progressively 
decreases the number for the fourth and fifth days. With more or 
less irregularity either the opposite takes place with all the other 
groups, that is each tends to increase its errors from day to day, or 
they fail to materially reduce the errors. The bright nine year old 
boys offer the most irregularity. One is in doubt in regard to this 
group whether to believe the number of errors tends to increase 
throughout the first four days and then tends to decrease, or whether 
the apparent great difficulty of the fourth day really belongs to the 
third and fourth days and means that progress remains about the same 
after the second day. 

Conclusion 5.—Measured by frequency of errors it seems that 
brightness operates to decrease errors more quickly. 


SUMMARY DISCUSSION


Comparison of bright and dull children in certain noted reactions
of the complex process of the task of this study shows for the groups
of the experiment that on the whole the older and brighter children
react more efficiently than the duller and younger. The noted differences
are not, however, certain differences in all cases. The certainty
seems best for bright twelve year old boys.

Two observations may be made in view of these findings. Each
has merit, and the two lead to an hypothesis of importance. First,
whatever real differences may exist between the four groups studied,
each group, and every last subject in each group, made appreciable
progress along the same lines of effort. Dull and bright, young and
old solved steps in the problem as measured by a practical,
common-sense criterion of an everyday kind, namely, repeated success
in getting the desired result, in this case winning the last clip three
times in succession at each step. It took the dull and younger a little
longer to advance from step to step, but by the same measure of success
they did advance several steps. They made errors, to be sure, but
the brighter and older ones also made errors, and the same kind of
errors. Individually, some of the brighter ones also made more
errors and progressed more slowly than some of the dull ones.

This observation of the general ability of the dull groups to pro- 
gress along the same lines as the brighter ones is an interesting one 
to keep in mind in view of the often heard statement that dull folks 
are not headed in the same direction as bright ones. 

The second observation offered is that perhaps the gross nature of
the measurements used in the records failed to reveal the importance of
the differences really present. Three illustrations of such apparently
small but significant differences are found in reputable mental tests.
Gesell, in his schedules for studying the mental growth of the
pre-school child, rates an infant who “unmistakably articulates” two
words in addition to ma-ma or da-da as a very high nine months old
child, while five words, only three more, place him as very high at
twelve months. That means that at this period of life just three
more words make a warrantable addition of 33 1/3 per cent in estimated
mental development. In the Stanford revision of the Binet test
the exact repetition of 12-13 syllables is given for the fourth year; the
repetition of 16-18 syllables for the sixth year; 20-22 syllables for the
tenth; 28 syllables for the sixteenth. Thirty words in the vocabulary
test place one at ten years; forty at twelve; fifty at fourteen; sixty-five
at sixteen; and seventy-five at eighteen. The nature of the task
studied in this report may very well be so complex and demanding
that the small differences recorded in the data really mean great
differences in ability.
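
The arithmetic of the Gesell illustration may be sketched as follows.
The sketch assumes, though the text does not say so, that the 33 1/3
per cent is the three-month advance from the nine-month to the
twelve-month level taken as a fraction of the nine-month level. The
function name is illustrative only.

```python
from fractions import Fraction


def relative_gain(lower_age_months, upper_age_months):
    """Advance in age level as a fraction of the lower level.

    Assumed reading of the text's percentage, not a formula given by Gesell.
    """
    return Fraction(upper_age_months - lower_age_months, lower_age_months)


# Two words rate "very high" at nine months; five words at twelve months.
extra_words = 5 - 2                 # three more words
gain = relative_gain(9, 12)         # three months over nine months
percent = gain * 100                # exact fraction 100/3, i.e. 33 1/3 per cent

print(extra_words, "more words:", float(percent), "per cent")
```

On this reading a difference of only three words carries a third of the
infant's whole estimated development to that date.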

If this conclusion is tenable, a further general conclusion is
justified and, it seems, highly significant. It is, in words of the day,
that the ability of the dull and young subjects in this complicated
problem is not to be laughed at. How far they have gone from the
true zero point cannot be said, but if the small recorded difference
between their ability and that of the bright stands for a large real
difference, then the difference between the arithmetical zero of these
data and the points scored by the dull groups must stand for a truly
great amount of ability, certainly for so much that in it is
promise, not despair.

Final Conclusion.—Dull nine and twelve year old boys and girls
found the complex task of learning to win a game, success in which
depended upon the application of an underlying principle, more
difficult than did bright nine and twelve year old boys and girls, when
their success was estimated in terms of frequency and repetition of
errors. The dull did progress, however, and the bright children
made the same kind of errors but fewer of them.